Jump to content

Function in C - Why "malloc( )" required for creating char *

3.14159

Hello,

 

I was hoping someone could offer some insight on my code below. I am learning to code C and still learning the fundamentals.

 

Description of function: The following function written in C (written for a "codewars.com" puzzle) takes a string as an input and returns a fraction indicating how many characters are not "between" 'a' and 'm'. These characters are called "errors". The fraction is formatted and returned as a string ( error characters / total characters ).

 

My working solution is below, along with the non-working solution second.

 

I have a few questions regarding the solution:

 

1) Why does the pointer "out" need to be malloced?

    - my thought is because strings in the "stack" cannot be written, but I'm not certain if this is the reason (or if it's even true).

2) Is there a way to write this program without using malloc( ) ?

3) If malloc is used inside the function, how can I free the pointer "out"?

    - I read you should always free any memory allocated with malloc( )

 

link to problem description: "Printer Errors"

 

Thanks in advance. Any general comments on my code are always welcome.

 

-DR

 

note: The only difference is on line 8.

Working code:

#include <stdio.h>
#include <string.h>
#include <stdlib.h>

char* printerError(char *s) {
  
  int i = 0, errorCount = 0, totalChars;
  char * out = malloc(0);
  
  totalChars = strlen(s);
  
  for( ; i < totalChars; i++) {
      if (s[i] < 0x61 || s[i] > 0x6D) {          // This tests if s[i] is a char between 'a' and 'm'
          errorCount++;							 // If s[i] is not bewteen 'a' and 'm' it is an "error"
      }
  }
  
  sprintf(out, "%d/%d", errorCount, totalChars);
  return out;
  
}

 

Non-working code:

#include <stdio.h>
#include <string.h>
#include <stdlib.h>

char* printerError(char *s) {
  
  int i = 0, errorCount = 0, totalChars;
  char * out = "This is a string";
  
  totalChars = strlen(s);
  
  for( ; i < totalChars; i++) {
      if (s[i] < 0x61 || s[i] > 0x6D) {          // This tests if s[i] is a char between 'a' and 'm'
          errorCount++;							 // If s[i] is not bewteen 'a' and 'm' it is an "error"
      }
  }
  
  sprintf(out, "%d/%d", errorCount, totalChars);
  return out;
  
}

 

Link to comment
Share on other sites

Link to post
Share on other sites

C doesn't handle strings well, as you're finding out. Both of these functions have errors, just the first is run-time and the second is compile-time.

In the first version, you have 

char * out = malloc(0);

which might work with your compiler, but won't necessarily work with any C compiler. (Some compilers return a NULL pointer for malloc(0), which means that you'll get a segmentation fault when you use this in the sprintf call.) The problem is that you're not actually allocating any space here; technically, even though your compiler returns an apparently valid pointer, any use of it (such as in sprintf) is a buffer overflow error. (If you're malloc-ing anything else, you might overwrite it and cause errors.) To fix this, figure out how many chars you'll actually need (some math from the string length will get you there) and malloc that number. (i.e., if strlen returns a two-digit number, you need 6 chars [2 for the first number, 1 for the /, 2 for the second number, and 1 for the null char at the end]; if it's three-digits you need 8, etc.)

The other issue here is that by malloc-ing a char* and returning it from the function, you're putting the burden of free-ing it onto the calling code, and if that doesn't happen, you could get a memory leak.

 

Version 2, without the malloc, doesn't work because the pointer is pointing to a read-only string literal. Thus, when you try to change it (with sprintf), there is an error. A fix here is to use the fact that arrays and pointers are closely related, and use 

char out[16];

This will most probably allocate more characters than you actually need, but that's ok, it's better to have too many than too few. (From my math above, this allows up to a 7-digit number of chars in the original string, or up to 9,999,999 characters. You could adjust the 16 to allow whatever number size you think you'll need; however, it has to be set at compile-time, and cannot by dynamic. If you want it dynamic, you have to malloc it.) This allocates the space onto the stack instead of the heap, which means that it'll automatically be freed without a memory leak.

 

On a side note, once you know how long "out" can be (whether malloc-ed or statically assigned), you should use snprintf instead of sprintf to ensure there's no buffer overflow. (Example: if I use malloc(4), then try to print "1/10", that's 5 characters, and I've overflowed the allocated space. sprintf will let you do this; snprintf takes the 4 as an argument and will only allow up to 4 chars.)

 

In summary, here's your answers:

1) Why does the pointer "out" need to be malloced?

    It doesn't have to be, but that's the only way to get a dynamic array length. If you're ok with a static length, you can use e.g. char out[10]; instead.

2) Is there a way to write this program without using malloc( ) ?

    Yes, but only with a static array length, as above.

3) If malloc is used inside the function, how can I free the pointer "out"?

    - I read you should always free any memory allocated with malloc( )

    Yes, that's correct. Since you're returning the pointer, the calling function assumes the responsibility for freeing it. This is messy, and is often the cause of memory leaks. In this case, it might look something like:

char* output = printerError(myString);

// do something else with 'output'

free(output);

 

Link to comment
Share on other sites

Link to post
Share on other sites

4 minutes ago, fordy_rounds said:

The problem is that you're not actually allocating any space here; technically, even though your compiler returns an apparently valid pointer, any use of it (such as in sprintf) is a buffer overflow error. (If you're malloc-ing anything else, you might overwrite it and cause errors.)

Ha, I've been staring at this trying to figure out how the working code could actually work. Wouldn't this fix the sizing issue?

totalChars = strlen(s);
char * out = malloc(totalChars * sizeof(char));

 

Make sure to quote or tag me (@JoostinOnline) or I won't see your response!

PSU Tier List  |  The Real Reason Delidding Improves Temperatures"2K" does not mean 2560×1440 

Link to comment
Share on other sites

Link to post
Share on other sites

1 minute ago, JoostinOnline said:

Ha, I've been staring at this trying to figure out how the working code could actually work. Wouldn't this fix the sizing issue?


totalChars = strlen(s);
char * out = malloc(totalChars * sizeof(char));

 

It might, but consider corner cases. If 's' has, say, 20 characters, then you're allocating 20 chars when you only need at most 6. (That's ok; better too many than too few.) However, what happens if 's' is only 1-3 characters? You always need out to have at least 4 chars (because the string "1/3" needs 4 chars), but you would only allocate 1-3.

You can, however, use some math to figure it out. Count the number of digits in totalChars (a little Googling will find you some algorithms, but it's basically counting iterations while dividing by 10), then you need 2*numDigits+2 chars allocated.

Link to comment
Share on other sites

Link to post
Share on other sites

Interesting comments about malloc(0). Was playing around with different values in there and I was also surprised when zero worked. I left it in there just to see if it triggered any comments from you all!

 

Regarding the memory leak - it appears you are saying it is preferable to not put the burden on the function doing the calling. This makes sense to me, no need for a called function to require "clean up" after it's called.

 

However, I still couldn't get the code to run without malloc. Replacing line 8 with

char out[16];

occasionally passed one of the three "tests", but usually crashed. Here is the error message:

 

solution.c:20:10: warning: address of stack memory associated with local variable 'out' returned [-Wreturn-stack-address]
  return out;
         ^~~
1 warning generated.

It looks like the "out" variable (when declared as "char out[16]") cannot be used as a return value as it is local to the function (??). Just for kicks I wrote a script and compiled on my computer and got the same error message.

 

edit: the message is just a warning but the function doesn't actually return any value when called.

Edited by 3.14159
clarification
Link to comment
Share on other sites

Link to post
Share on other sites

9 minutes ago, 3.14159 said:

Interesting comments about malloc(0). Was playing around with different values in there and I was also surprised when zero worked. I left it in there just to see if it triggered any comments from you all!

 

Regarding the memory leak - it appears you are saying it is preferable to not put the burden on the function doing the calling. This makes sense to me, no need for a called function to require "clean up" after it's called.

 

However, I still couldn't get the code to run without malloc. Replacing line 8 with



char out[16];

occasionally passed one of the three "tests", but usually crashed. Here is the error message:

 



solution.c:20:10: warning: address of stack memory associated with local variable 'out' returned [-Wreturn-stack-address]
  return out;
         ^~~
1 warning generated.

It looks like the "out" variable (when declared as "char out[16]") cannot be used as a return value as it is local to the function (??). Just for kicks I wrote a script and compiled on my computer and got the same error message.

 

edit: the message is just a warning but the function doesn't actually return any value when called.

Yes, the reason it doesn't work is because in C an array (eg. int v[]) is almost the same as a pointer (int *p), but the array is allocated on the stack and not on the heap.

 

When you declare a vector int v[] = {0, 1, 2, 3, 4}, v is a pointer to the location in memory where the beginning of the array is found. You can view the exact memory location if you do something like printf("%p\n", v).

 

To be exact, the only difference you will see between a declaration like int *v or int v[] is that int *v sizeof returns 4/8 bytes (the size of a memory address, 4 bytes for a 32 bit architecture and 8 bytes for 64 bits), while sizeof for int v[] returns the entire bytes occupied by the array (for an int v[4], that would be 4 int-s of 4 bytes each so 16 byes)

 

An array such as int v[] is allocated on the stack. You can think of it as if each function call has its own stack portion and all local variables belong on the local portion of the stack. When you declare a variable such as int x inside a function foo, it's allocated on the local stack of the function f. It is not visible outside the function, eg. you can't access the variable x in another function g.

 

Malloc is a way of doing dynamic allocation in C, meaning that you allocate memory on the heap and not on the stack. That way, you can pass results from one function to another by returning pointers to variables in memory that doesn't go out of scope when the function ends.

 

What happens in your code if you allocate statically (on the stack), is that after your function ends, its variables on the stack are sometimes replaced by other variables of other functions. Sometimes it doesn't, and the function works, but it's not something you can count on.

 

If you need more info I would be glad to help.

 

 

 

i5 4670k @ 4.2GHz (Coolermaster Hyper 212 Evo); ASrock Z87 EXTREME4; 8GB Kingston HyperX Beast DDR3 RAM @ 2133MHz; Asus DirectCU GTX 560; Super Flower Golden King 550 Platinum PSU;1TB Seagate Barracuda;Corsair 200r case. 

Link to comment
Share on other sites

Link to post
Share on other sites

20 minutes ago, Nineshadow said:

Malloc is a way of doing dynamic allocation in C, meaning that you allocate memory on the heap and not on the stack. That way, you can pass results from one function to another by returning pointers to variables in memory that doesn't go out of scope when the function ends.

I follow - thank you everyone for the explanations. Will play around with this some more, or maybe just move on I'm getting tired of thinking about this one lol.

Link to comment
Share on other sites

Link to post
Share on other sites

Create an account or sign in to comment

You need to be a member in order to leave a comment

Create an account

Sign up for a new account in our community. It's easy!

Register a new account

Sign in

Already have an account? Sign in here.

Sign In Now

×