Why can't strings have some pre-allocated bytes in C for storing the length?

Gat Pelsinger
Solved by wanderingfool2
In C, we need to use strlen to get the length of a string because the end of the string is marked only by a null terminator after the last character. Wouldn't it have been better to have a few pre-allocated bytes of data at the start of the string which could store the string's length?

Microsoft owns my soul.

 

Also, Dell is evil, but HP kinda nice.

2 minutes ago, Gat Pelsinger said:

In C, we need to use strlen to get the length of a string because the end of the string is marked only by a null terminator after the last character. Wouldn't it have been better to have a few pre-allocated bytes of data at the start of the string which could store the string's length?

You can create a struct and do that.

Many other languages do have such abstractions over strings, such as C++.
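As a rough sketch of the "create a struct" idea, here is one way the length could sit in front of the characters in a single allocation. The names `lstring`, `lstring_new`, and `lstring_len` are made up for illustration and are not part of any standard library; the layout relies on a C99 flexible array member.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical type: the length is stored in front of the characters. */
struct lstring {
    size_t len;   /* number of characters, not counting the terminator */
    char data[];  /* C99 flexible array member, kept NUL-terminated so
                     it can still be handed to ordinary libc functions */
};

/* Builds an lstring from a C string. Returns NULL if allocation fails. */
struct lstring *lstring_new(const char *s) {
    assert(s != NULL);
    size_t len = strlen(s);  /* the one scan needed to learn the length */
    struct lstring *ls = malloc(sizeof *ls + len + 1);
    if (ls == NULL)
        return NULL;
    ls->len = len;
    memcpy(ls->data, s, len + 1);  /* copies the terminator too */
    return ls;
}

/* Constant-time length lookup: no walk over the characters. */
size_t lstring_len(const struct lstring *ls) {
    assert(ls != NULL);
    return ls->len;
}
```

A caller would create the string once with `lstring_new("hello")`, read the length with `lstring_len`, and release it with `free`. Keeping the terminator is a design choice in this sketch so that `ls->data` still works with functions that expect a plain C string.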

FX6300 @ 4.2GHz | Gigabyte GA-78LMT-USB3 R2 | Hyper 212x | 3x 8GB + 1x 4GB @ 1600MHz | Gigabyte 2060 Super | Corsair CX650M | LG 43UK6520PSA
ASUS X550LN | i5 4210u | 12GB
Lenovo N23 Yoga

20 minutes ago, Gat Pelsinger said:

Wouldn't it have been better to have a few pre-allocated bytes of data at the start of the string which could store the string's length?

No, because then it wouldn't be an array of characters, which is all a "string" is in C.

Don't ask to ask, just ask... please 🤨

sudo chmod -R 000 /*

You have to keep in mind how old C is. Even today, it is used on embedded platforms where memory space is at a premium. How often do you need to know the length of a string vs how many bytes would that "waste"? You can always store the length alongside the char[], if you need it for a particular use case. It is a low level language only a small step above assembler, so it doesn't come with such abstractions included.
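To put a rough number on the "waste", the sketch below compares the storage for a NUL-terminated string with a hypothetical layout that stores a length field in front of the characters and drops the terminator. The function names and the `prefix_size` parameter are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Bytes needed to store s as an ordinary NUL-terminated string. */
size_t bytes_nul_terminated(const char *s) {
    assert(s != NULL);
    return strlen(s) + 1;  /* characters plus the terminator */
}

/* Bytes needed if a length field of prefix_size bytes is stored in
   front of the characters and no terminator is kept (hypothetical). */
size_t bytes_length_prefixed(const char *s, size_t prefix_size) {
    assert(s != NULL);
    return prefix_size + strlen(s);
}
```

For `"ok"`, the terminated form takes 3 bytes. A 1-byte prefix also takes 3 bytes but caps the length at 255 characters, while a `size_t` prefix takes `sizeof(size_t) + 2` bytes, which is 10 on a typical 64-bit platform.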

Remember to either quote or @mention others, so they are notified of your reply

I keep seeing questions from you about similar things. Look, if you want a more convenient programming language, why are you learning C?

MSI GX660 + i7 920XM @ 2.8GHz + GTX 970M + Samsung SSD 830 256GB

5 hours ago, Eigenvektor said:

How often do you need to know the length of a string vs how many bytes would that "waste"?

Also consider how you'd generate that length number in the first place... you'd have to walk through the string at least once to find the terminator, even if you never actually need to know the length of that string in your program, so what would be the point? QOL functionality like this only makes sense if you already have layers of abstraction well above what C generally offers.
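A small sketch of where a length would come from. For text that arrives without a count, something has to do a scan like `scan_len` below (a hand-written stand-in for `strlen`, named here only for illustration). The one case where no scan is needed is a string literal, because the compiler already knows its size; the `LITERAL_LEN` macro is likewise an illustrative name.

```c
#include <assert.h>
#include <stddef.h>

/* For a string literal the compiler already knows the size. sizeof
   counts the terminator, so subtract one. This only works on literals
   and arrays, not on pointers. */
#define LITERAL_LEN(lit) (sizeof(lit) - 1)

/* What a strlen-style scan does when the length is not known up front:
   walk the characters until the terminator is found. */
size_t scan_len(const char *s) {
    assert(s != NULL);
    size_t n = 0;
    while (s[n] != '\0')
        n++;
    return n;
}
```

Both `LITERAL_LEN("ground")` and `scan_len("ground")` evaluate to 6; the difference is that the first is a compile-time constant and the second walks the string each time it is called.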

18 hours ago, Gat Pelsinger said:

In C, we need to use strlen to get the length of a string because the end of the string is marked only by a null terminator after the last character. Wouldn't it have been better to have a few pre-allocated bytes of data at the start of the string which could store the string's length?

Well, it's pretty much like @Eigenvektor said.

To add to that, though: a built-in length adds overhead, and it forces that overhead onto everyone, including people who never need the length. If you really need that sort of thing, you can easily store the length in a struct yourself.

Traversing a string actually has a minimal time penalty given how rarely you need the length. For example, if you call strcmp, you still have to walk both strings until they differ or one of them ends...except now, instead of just passing in a pointer to the string, you have to pass a struct (which takes up 2 registers) or a pointer to the struct, which you then have to dereference to get the pointer to the string.
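To make the strcmp point concrete, here is a sketch of what a comparison over length-carrying strings might look like. The `strview` struct and `strview_cmp` function are hypothetical names, not a real library API.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical pointer-plus-length pair. Passed by value it occupies
   two machine words instead of the single word a char* needs. */
struct strview {
    const char *data;
    size_t len;
};

/* Three-way comparison in the style of strcmp. */
int strview_cmp(struct strview a, struct strview b) {
    assert(a.data != NULL && b.data != NULL);
    size_t n = a.len < b.len ? a.len : b.len;
    int r = memcmp(a.data, b.data, n);  /* still inspects the bytes */
    if (r != 0)
        return r;
    /* The common prefix is equal, so the shorter string sorts first. */
    return (a.len > b.len) - (a.len < b.len);
}
```

Knowing the lengths does not remove the byte-by-byte work here: `memcmp` still has to look at the characters to find the first difference. The lengths only decide the result once one string turns out to be a prefix of the other.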

 

Overall, the above would add overhead (both in memory and CPU cycles) for no benefit.

Many of the string functions would behave the same way: you would essentially be sacrificing performance for the sake of knowing the length of the string.

You run into even more "inefficiencies" when you are working with constant strings and stepping through them in place.

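#include <string.h>

const char* quickexample = "abcdefghijklmnopqrstuvwxyz";
const char* target = "ground";

/* Returns a pointer to the first suffix of quickexample that does not sort before target. */
const char* somefunction(void) {
    const char* qe = quickexample;
    /* Every position of qe is itself a valid string, so nothing is copied. */
    while (*qe != '\0' && strcmp(qe, target) < 0)
        qe++;
    return qe;
}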

Notice that the above works on the raw string. If strings were required to carry their length at the start, you wouldn't be able to do something like this, because a pointer into the middle of the string has no length stored in front of it...instead you would have to create a new string each time and then do the comparison. It gets worse because these are const char*, and lots of strings are const...which makes passing them as a struct harder as well, because then you effectively have to copy the string each time.
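The difference can be sketched side by side. The `pstr` type and its helpers below are hypothetical and assume a layout where the count sits directly in front of the characters; with a terminated string a suffix is just pointer arithmetic, while the prefixed layout needs a fresh header and therefore a fresh allocation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical length-prefixed layout: count first, characters after. */
struct pstr {
    size_t len;
    char data[];  /* not NUL-terminated in this sketch */
};

/* Builds a pstr from a C string. Returns NULL if allocation fails. */
struct pstr *pstr_new(const char *s) {
    assert(s != NULL);
    size_t len = strlen(s);
    struct pstr *p = malloc(sizeof *p + len);
    if (p == NULL)
        return NULL;
    p->len = len;
    memcpy(p->data, s, len);
    return p;
}

/* NUL-terminated: a suffix is a pointer into the same bytes. */
const char *cstr_suffix(const char *s, size_t offset) {
    assert(s != NULL && offset <= strlen(s));
    return s + offset;
}

/* Length-prefixed: the suffix needs its own header, so the characters
   are copied into a new allocation. Returns NULL if allocation fails. */
struct pstr *pstr_suffix(const struct pstr *p, size_t offset) {
    assert(p != NULL && offset <= p->len);
    size_t len = p->len - offset;
    struct pstr *out = malloc(sizeof *out + len);
    if (out == NULL)
        return NULL;
    out->len = len;
    memcpy(out->data, p->data + offset, len);
    return out;
}
```

A separate pointer-plus-length pair could describe a suffix without copying, but that is the two-register struct form discussed earlier in this post, not a length stored at the start of the string itself.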

3735928559 - Beware of the dead beef
