Get and set values of a char array via pointer arithmetic in C

I am missing something regarding pointers and strings in C. I am simply trying to get and set an element of a character array that was created via a pointer. I can easily get each character via pointer arithmetic, but I cannot set any of the elements that way. Please see the examples. What am I missing here? Isn't s1 the same in both examples? I am using MinGW (gcc) on Windows 10.

Example A) This works, s1 is printed as "abxd"

char *s1;
s1 = (char[]){'a','b','c','d','\0'};
*(s1+2)='x';

Example B) This does not work, it just crashes.

char *s1;
s1 = "abcd";
*(s1+2)='x';    //this is the problem, can get but can not set

Edit: Based on the comments, the string in Example B lives in static read-only memory and cannot be edited. So if I want to edit the string, I have to either allocate it with malloc (heap memory, e.g. Example C) or define the array on the stack (e.g. Example D), correct?

Example C) - works

char *s1;
s1 = (char*)malloc((4+1)*sizeof(char));
s1 = strcpy(s1,"abcd");
*(s1+2)='x';  //or s1[2] = 'x'

Example D) - works

char s1[4];  // would have thought need to be min of s1[5]
strcpy(s1,"abcd");  // an array name cannot be assigned to, so no "s1 ="
*(s1+2)='x';   // or s1[2]='x';

2 answers

  • answered 2018-07-11 04:38 David C. Rankin

    Let's look at your examples and make sure you know why what is happening is happening. But first, a quick review of pointers to make sure we are on the same page:

    A Pointer & Pointer Arithmetic

    A pointer is simply a normal variable that holds the address of something else as its value. In other words, a pointer points to the address where something else can be found. Where you normally think of a variable holding an immediate value, such as int a = 5;, a pointer simply holds the address where 5 is stored in memory, e.g. int *b = &a;. It works the same way regardless of what type of object the pointer points to, because the type of the pointer controls the pointer arithmetic: with a char * pointer, pointer+1 points to the next byte; with an int * pointer (typically a 4-byte integer), pointer+1 points to the address 4 bytes after pointer. (So a pointer is just a pointer, and the arithmetic is scaled automatically by its type.)
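
    As a minimal sketch of that scaling (the function names char_step, int_step and third_element are hypothetical, chosen only for illustration), the byte distance between p and p + 1 can be measured by converting both addresses to char *:

```c
#include <stddef.h>

/* Byte distance between p and p + 1 when p is a char pointer. */
size_t char_step(void)
{
    char buf[4] = { 'a', 'b', 'c', 'd' };
    char *p = buf;
    return (size_t)((p + 1) - p);                   /* sizeof(char), which is 1 */
}

/* Byte distance between p and p + 1 when p is an int pointer. */
size_t int_step(void)
{
    int nums[4] = { 10, 20, 30, 40 };
    int *p = nums;
    /* Convert to char * so the subtraction counts bytes, not ints. */
    return (size_t)((char *)(p + 1) - (char *)p);   /* sizeof(int) */
}

/* *(p + 2) and p[2] are two spellings of the same access. */
int third_element(const int *p)
{
    return *(p + 2);
}
```

    Calling third_element on an array holding 10, 20, 30 yields the same value as indexing it with [2], which is why s1[2] and *(s1+2) are interchangeable in the question.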

    What am I doing in Example A?

    Your initialization is key to why Example A works and why Example B crashes. Example A uses a compound literal to initialize s1 so s1 points to the first character 'a' in "abcd" in modifiable memory. The compound-literal was introduced in C99, but gcc provides the compound-literal as an extension to C89 as well. In Example A you use:

    s1 = (char[]){'a','b','c','d','\0'};

    which is equivalent to

    s1 = (char[]){ "abcd" };

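    The compound literal has the form (type){ ..initializer.. }. The (type) part looks like a cast, but it is not one: it creates an unnamed object of that type, initialized from the braces. In your Example A the result is an unnamed char[5] holding "abcd", which you can freely modify. (When written inside a function it has automatic storage, so it is only valid until the enclosing block ends.)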
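
    A small sketch of Example A wrapped in a function (the name compound_literal_demo and the out parameter are assumptions added for illustration) shows the write succeeding on the compound literal:

```c
#include <string.h>

/* Writes the modified text into out, which must hold at least 5 bytes. */
void compound_literal_demo(char out[5])
{
    /* Unnamed, modifiable char[5]; it lives until the enclosing block ends. */
    char *s1 = (char[]){ "abcd" };

    *(s1 + 2) = 'x';    /* same as s1[2] = 'x' */
    strcpy(out, s1);    /* copy the result out before the literal goes away */
}
```

    The result is copied to caller-provided storage because returning s1 itself would hand back a pointer to an object whose lifetime has ended.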

    Why does Example B Crash?

    On the other hand:

    s1 = "abcd";

    points s1 at a string literal. Most implementations place string literals in read-only memory (generally in the .rodata section of the executable), and the C standard makes any attempt to modify one undefined behavior. See: Why are C string literals read-only? for a historical view. Attempting the write generally results in a segmentation fault or access violation (as you have probably found).
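
    One common way to guard against this, sketched here with hypothetical function names, is to declare pointers to string literals as const char * so that the compiler rejects the write, and to copy the text into storage you own when it has to change:

```c
#include <string.h>

/* Reading through a pointer to a string literal is always fine. */
char third_char(void)
{
    const char *s1 = "abcd";    /* const: *(s1 + 2) = 'x' would not compile */
    return *(s1 + 2);
}

/* To modify the text, copy the literal into writable storage first. */
void modified_copy(char out[5])
{
    const char *s1 = "abcd";

    strcpy(out, s1);            /* out now holds a private, writable copy */
    *(out + 2) = 'x';
}
```

    With the const qualifier in place, the mistake in Example B is reported at compile time instead of crashing at run time.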

    You were right in your comment on Example D!

    char s1[4];

    Creates a character array with space for 4 characters. When you call strcpy (s1, "abcd"); you are attempting to copy 1 more character than will fit, because strcpy also copies the terminating nul character:

    'a' 'b' 'c' 'd' '\0'
     1   2   3   4   5

    This invokes Undefined Behavior and can lead to an exploitable buffer overflow. From man 3 strcpy:

    If the destination string of a strcpy() is not large enough, then anything might happen. Overflowing fixed-length string buffers is a favorite cracker technique for taking complete control of the machine. Any time a program reads or copies data into a buffer, the program first needs to check that there's enough space. This may be unnecessary if you can show that overflow is impossible, but be careful: programs can get changed over time, in ways that may make the impossible possible.

    So just as you allocated (4+1) chars/bytes in Example C, you need at least (4+1) chars/bytes of storage in s1 in Example D.
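
    A sketch of Example D with the size corrected follows. The TEXT macro and the function names are assumptions for illustration; the point is that sizeof applied to a string literal already counts the terminating nul, while strlen does not:

```c
#include <string.h>

#define TEXT "abcd"

/* sizeof on a string literal counts the terminating '\0'. */
size_t literal_size(void)   { return sizeof TEXT; }

/* strlen counts only the characters before the '\0'. */
size_t literal_length(void) { return strlen(TEXT); }

/* Example D with enough room: 4 characters plus the terminator. */
void example_d_fixed(char out[5])
{
    char s1[sizeof TEXT];       /* char s1[5] */

    strcpy(s1, TEXT);           /* fits, including the '\0' */
    *(s1 + 2) = 'x';            /* or s1[2] = 'x' */
    strcpy(out, s1);
}
```

    Sizing the array from the literal itself avoids the off-by-one that char s1[4] introduces.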

    Remember, every C-library str... function requires a nul-terminated string. When you create a character array, it is your responsibility to ensure that it is nul-terminated to make it a string in C. If it's not nul-terminated, then it is simply an array of characters -- and any time you fail to pass a nul-terminated string to a function expecting one, the function will not know when to stop reading and will happily stray off reading out-of-bounds until it happens to stumble upon a zero byte or segfaults, whichever occurs first.
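
    If you are unsure whether a buffer holds a string, one defensive option is to search for the terminator within the known capacity before handing the buffer to a str... function. This is only a sketch; is_terminated and bounded_length are hypothetical helpers built on the standard memchr:

```c
#include <string.h>

/* Returns 1 if a '\0' occurs within the first cap bytes of buf, else 0. */
int is_terminated(const char *buf, size_t cap)
{
    return memchr(buf, '\0', cap) != NULL;
}

/* Length of the string in buf, or cap if no terminator is found in range. */
size_t bounded_length(const char *buf, size_t cap)
{
    const char *end = memchr(buf, '\0', cap);
    return end ? (size_t)(end - buf) : cap;
}
```

    Unlike strlen, neither helper reads past cap bytes, so a plain array of characters such as char raw[4] = {'a','b','c','d'} can be checked safely.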

    Look things over and digest them and let me know if you have further questions. (and add a '\n' to your printf format string (e.g. "%s\n") so that a newline is output -- at the very least on your last call to make your program POSIX compliant)

  • answered 2018-07-11 06:20 P__J__

    The first example creates the char array in read/write memory, which you can modify. The second one makes the pointer point to a read-only char array. When you try to modify a read-only memory location, you get the error.

    You can also create the array in read/write memory with char x[] = "1234";
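
    A brief sketch of that form (the function names are made up for illustration): the array is sized by its initializer and receives its own copy of the characters, so writing through pointer arithmetic is safe:

```c
#include <string.h>

/* The array is sized from its initializer, terminator included. */
size_t array_init_size(void)
{
    char x[] = "1234";
    return sizeof x;            /* 5 */
}

/* x holds its own writable copy of the literal's characters. */
void array_init_demo(char out[5])
{
    char x[] = "1234";

    *(x + 2) = 'x';             /* or x[2] = 'x' */
    strcpy(out, x);
}
```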