How can bytes in char array represent integers?

So let's say I have a char array that I read from a binary file (such as an ext2-formatted filesystem image file).

Now I need to read an integer starting at byte offset 1024 (that's the offset from the start of the data). Is there a neat way of doing it? The integer could be any number, so I believe it can be represented as a 4-byte integer on my system (x86-64).

I believe I need to use strtol like:

/* Convert the provided value to a decimal long */
char *eptr=malloc(4);// 4 bytes because sizeof int is 4 bytes
....
int valread=read(fd,eptr,4);//fd is to ext2 formatted image file (from file system)
result = strtol(eptr, &v, 10);

The result above is a long, so is this the right way to represent a 32-bit integer?

Should eptr be null terminated?

Is this correct or not?

2 answers

  • answered 2022-05-04 10:19 Some programmer dude

    In the case of strtol it might be easier to follow along by seeing some code. So here is a very simplified strtol-like function:

    int string_to_int(const char *string)
    {
        // The integer value we construct and return
        int value = 0;
    
        // Loop over all the characters in the string, one by one,
        // until the string null-terminator is reached
        for (unsigned i = 0; string[i] != '\0'; ++i)
        {
            // Get the current character
            char c = string[i];
    
            // Convert the digit character to its corresponding numeric value
            int c_value = c - '0';
    
            // Add the character's numeric value to the current value
            value = (value * 10) + c_value;
    
            // Note the multiplication with 10: That's because decimal numbers are base 10
        }
    
        // Now the string has been converted to its decimal integer value, return it
        return value;
    }
    

    If we call it with the string "123" and unroll the loop what's happening is this:

    // First iteration
    char c = string[0];  // c = '1'
    int c_value = c - '0';  // c_value = 1
    value = (value * 10) + c_value;  // value = (0 * 10) + 1 = 0 + 1 = 1
    
    // Second iteration
    char c = string[1];  // c = '2'
    int c_value = c - '0';  // c_value = 2
    value = (value * 10) + c_value;  // value = (1 * 10) + 2 = 10 + 2 = 12
    
    // Third iteration
    char c = string[2];  // c = '3'
    int c_value = c - '0';  // c_value = 3
    value = (value * 10) + c_value;  // value = (12 * 10) + 3 = 120 + 3 = 123
    

    In the fourth iteration we reach the string null-terminator and the loop ends with value being equal to the int value 123.

    I hope this makes it a little clearer how string-to-number conversions work.


    While the above is for strings, if you read the raw bytes of an existing int value, then you should not call strtol because the data isn't a string.

    Instead you basically interpret the four bytes as a single 32-bit value.

    Unfortunately it's not easy to explain how these bits are interpreted without knowing a thing or two about endianness.

    Endianness is how the bytes are ordered to make up the integer value. Taking the (hexadecimal) number 0x01020304, its four bytes can be stored either as 0x01, 0x02, 0x03 and 0x04 (this is called big-endian), or as 0x04, 0x03, 0x02 and 0x01 (this is called little-endian).
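
    To make the two orderings concrete, here is a minimal sketch that writes a 32-bit value into a byte array in each order. The helper names store_be32 and store_le32 are made up for this illustration and are not part of any library.

```c
#include <stdint.h>

/* Hypothetical helper: write v into out[0..3], most significant byte first
   (big-endian order). */
void store_be32(uint32_t v, unsigned char out[4])
{
    out[0] = (unsigned char)((v >> 24) & 0xFF);
    out[1] = (unsigned char)((v >> 16) & 0xFF);
    out[2] = (unsigned char)((v >> 8) & 0xFF);
    out[3] = (unsigned char)(v & 0xFF);
}

/* Hypothetical helper: write v into out[0..3], least significant byte first
   (little-endian order). */
void store_le32(uint32_t v, unsigned char out[4])
{
    out[0] = (unsigned char)(v & 0xFF);
    out[1] = (unsigned char)((v >> 8) & 0xFF);
    out[2] = (unsigned char)((v >> 16) & 0xFF);
    out[3] = (unsigned char)((v >> 24) & 0xFF);
}
```

    Calling store_be32(0x01020304, buf) fills buf with 0x01, 0x02, 0x03, 0x04, while store_le32 with the same value fills it with 0x04, 0x03, 0x02, 0x01. Because the helpers use shifts rather than copying memory, the result does not depend on the byte order of the machine running them.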

    On a little-endian system (your normal PC-like system) say you have an array like this:

    char bytes[4] = { 0x04, 0x03, 0x02, 0x01 };
    

    then you could copy it into an int:

    int value;
    memcpy(&value, bytes, 4);
    

    and that will make the int variable value equal to 0x01020304.
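
    The memcpy approach gives that result only when the host's byte order matches the order of the stored bytes. As an alternative sketch, the bytes can be combined with shifts, which should give the same value on any host. The names load_le32 and load_be32 are hypothetical, and the functions take unsigned char so that bytes of 0x80 and above are not sign-extended.

```c
#include <stdint.h>

/* Hypothetical helper: combine four bytes stored least significant first
   (little-endian) into one 32-bit value, independent of host byte order. */
uint32_t load_le32(const unsigned char *p)
{
    return (uint32_t)p[0]
         | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16)
         | ((uint32_t)p[3] << 24);
}

/* Hypothetical helper: the same for bytes stored most significant first
   (big-endian). */
uint32_t load_be32(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24)
         | ((uint32_t)p[1] << 16)
         | ((uint32_t)p[2] << 8)
         | (uint32_t)p[3];
}
```

    With the array { 0x04, 0x03, 0x02, 0x01 } from above, load_le32 returns 0x01020304 and load_be32 returns 0x04030201.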

  • answered 2022-05-04 10:20 chux - Reinstate Monica

    I have char array that I read from binary file (like ext2 formatted filesystem image file).

    Open the file in binary mode

    const char *file_name = ...;
    FILE *infile = fopen(file_name, "rb");  // b is for binary
    if (infile == NULL) {
      fprintf(stderr, "Unable to open file <%s>.\n", file_name);
      exit(1);
    }
    

    I need to read integer starting at offset byte 1024 ...

    long offset = 1024; 
    if (fseek(infile, offset, SEEK_SET)) {
      fprintf(stderr, "Unable to seek to %ld.\n", offset);
      exit(1);
    } 
    

    So I believe can be represented in integer size of 4 byte on my system

    Rather than use int, whose size may differ from 4 bytes, consider int32_t from <stdint.h>.

    int32_t data4;
    if (fread(&data4, sizeof data4, 1, infile) != 1) {
      fprintf(stderr, "Unable to read data.\n");
      exit(1);
    } 
    

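    Account for endianness.

    ext2 stores its on-disk fields in little-endian order, so convert to the native byte order. See <endian.h>, a non-standard header (available with glibc on Linux) that declares le32toh().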

    data4 = le32toh(data4);
    

    Clean up when done

    // Use data4
    
    fclose(infile);
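
    The seek, read and byte-order steps above can also be combined into one function that uses only standard C, which may help where <endian.h> is not available. This is a sketch under assumptions: read_le32_at is a hypothetical name, the stream is assumed to be opened in binary mode, and the value is assumed to be stored little-endian.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: read the 32-bit little-endian value stored at byte
   `offset` of the binary stream `f`.
   Returns 0 on success, -1 if the seek fails or fewer than 4 bytes are read. */
int read_le32_at(FILE *f, long offset, uint32_t *out)
{
    unsigned char b[4];

    if (fseek(f, offset, SEEK_SET) != 0)
        return -1;
    if (fread(b, 1, sizeof b, f) != sizeof b)
        return -1;

    /* Assemble the value byte by byte so the host byte order does not matter. */
    *out = (uint32_t)b[0]
         | ((uint32_t)b[1] << 8)
         | ((uint32_t)b[2] << 16)
         | ((uint32_t)b[3] << 24);
    return 0;
}
```

    For the question's case, a call such as read_le32_at(infile, 1024, &value) would read the four bytes at offset 1024 and leave the decoded number in value.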
    

    believe I need to use strtol like

    No. strtol() examines a string and returns a long. File data is binary and not a string.
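
    To see the difference in practice, here is a small sketch of a wrapper around strtol that reports whether any digits were converted. The name parse_decimal is hypothetical, and the input is assumed to be null-terminated, which raw file data generally is not.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical helper: parse a decimal number from the start of the
   null-terminated string s.
   Returns 1 and stores the value in *out on success; returns 0 if s does not
   begin with a number or the value does not fit in a long. */
int parse_decimal(const char *s, long *out)
{
    char *end;

    errno = 0;
    long v = strtol(s, &end, 10);

    /* end == s means strtol consumed nothing: no digits were found. */
    if (end == s || errno == ERANGE)
        return 0;

    *out = v;
    return 1;
}
```

    Given the text "123" the function succeeds and stores 123. Given the bytes 0x04, 0x03, 0x02, 0x01 followed by a terminating zero, it reports failure, because none of those bytes is a digit character.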
