Jump to content

C - Randomized order String

CJPowell27

hey guys I'm trying to have the phrase "The quick brown fox jumps over the lazy dog" displayed in a random order and each word only once.

The program I'm writing is going to display a random word then ask the user to type that word, and repeat until all words are done. I'm not familiar with C but all I know is I would have to rand. could anyone give me some pointers? Any help is appreciated. 

i5 4670k| Asrock H81M-ITX| EVGA Nex 650g| WD Black 500Gb| H100 with SP120s| ASUS Matrix 7970 Platinum (just sold)| Patriot Venom 1600Mhz 8Gb| Bitfenix Prodigy. Build log in progress 

Build Log here: http://linustechtips.com/main/topic/119926-yin-yang-prodigy-update-2-26-14/

Link to comment
Share on other sites

Link to post
Share on other sites

10 minutes ago, CJPowell27 said:

hey guys I'm trying to have the phrase "The quick brown fox jumps over the lazy dog" displayed in a random order and each word only once.

The program I'm writing is going to display a random word then ask the user to type that word, and repeat until all words are done. I'm not familiar with C but all I know is I would have to rand. could anyone give me some pointers? Any help is appreciated. 

Create a linked list of those words, then have the random number generator generate how many items to traverse in the linked list. Pop the word out of the list, then repeat.

 

You could also try making an array of the words and randomly picking it, but you'd need a way to track which words you already used.

 

EDIT: Because I suggested a thing that even I hate to use, this article has a good way of removing something from a linked list: https://medium.com/@bartobri/applying-the-linus-tarvolds-good-taste-coding-requirement-99749f37684a#.mktusdrr5

Link to comment
Share on other sites

Link to post
Share on other sites

3 hours ago, M.Yurizaki said:

Create a linked list of those words, then have the random number generator generate how many items to traverse in the linked list. Pop the word out of the list, then repeat.

 

You could also try making an array of the words and randomly picking it, but you'd need a way to track which words you already used.

 

EDIT: Because I suggested a thing that even I hate to use, this article has a good way of removing something from a linked list: https://medium.com/@bartobri/applying-the-linus-tarvolds-good-taste-coding-requirement-99749f37684a#.mktusdrr5

So you're saying to make a linked list where each node has a single word, for the second part are you saying to have the number generator say how many times to traverse, use that for and then remove it until the list is empty?

This is my first time working with C so should I just make a build function to create the list?

i5 4670k| Asrock H81M-ITX| EVGA Nex 650g| WD Black 500Gb| H100 with SP120s| ASUS Matrix 7970 Platinum (just sold)| Patriot Venom 1600Mhz 8Gb| Bitfenix Prodigy. Build log in progress 

Build Log here: http://linustechtips.com/main/topic/119926-yin-yang-prodigy-update-2-26-14/

Link to comment
Share on other sites

Link to post
Share on other sites

1 minute ago, CJPowell27 said:

So you're saying to make a linked list where each node has a single word, for the second part are you saying to have the number generator say how many times to traverse, use that for and then remove it until the list is empty?

Pretty much.

1 minute ago, CJPowell27 said:

This is my first time working with C so should I just make a build function to create the list?

It would be useful to generalize the functions you need so you can expand this later. So you will need something to start the list, then add, remove, and traverse the list as necessary. The biggest hurdle though is you're going to have to deal with memory allocation, or the malloc call. Which means you need an accompanying free call as well unless you want a memory leak while the program is running. If you just want the program running, you can deal without a free call for now, and then figure it out later.

 

The other problem is working with strings (or the words), because the way C treats them is strange to say the least. Though I'm sure the string library should help you there.

 

I do feel like this is a little too advanced for someone starting in C. However, it'll expose you to the wonders it has right off the bat without getting too complex code-wise.

Link to comment
Share on other sites

Link to post
Share on other sites

2 minutes ago, M.Yurizaki said:

Because I feel like you're being thrown into the wolves, I'm willing to help out over PM. But you have to take the swings at bat :P

I have some knowledge in C++ so I might be able to figure it out but I will PM you if I get stuck, thank you!

i5 4670k| Asrock H81M-ITX| EVGA Nex 650g| WD Black 500Gb| H100 with SP120s| ASUS Matrix 7970 Platinum (just sold)| Patriot Venom 1600Mhz 8Gb| Bitfenix Prodigy. Build log in progress 

Build Log here: http://linustechtips.com/main/topic/119926-yin-yang-prodigy-update-2-26-14/

Link to comment
Share on other sites

Link to post
Share on other sites

Why not just make an array of the words and shuffle them? That sounds easier than randomly transversing a linked list. 

http://stackoverflow.com/questions/6127503/shuffle-array-in-c

15" MBP TB

AMD 5800X | Gigabyte Aorus Master | EVGA 2060 KO Ultra | Define 7 || Blade Server: Intel 3570k | GD65 | Corsair C70 | 13TB

Link to comment
Share on other sites

Link to post
Share on other sites

You have a phrase made of n words.

Assign each word an index 1 or 0 through n in the order they appear in the phrase.

Then generate a permutation of the set of indices using a randomized variant of backtracking. Output the first solution you find and exit the backtracking algorithm.

Or, instead of backtracking, use the Fisher-Yates shuffle to generate that random permutation.

 

i5 4670k @ 4.2GHz (Coolermaster Hyper 212 Evo); ASrock Z87 EXTREME4; 8GB Kingston HyperX Beast DDR3 RAM @ 2133MHz; Asus DirectCU GTX 560; Super Flower Golden King 550 Platinum PSU;1TB Seagate Barracuda;Corsair 200r case. 

Link to comment
Share on other sites

Link to post
Share on other sites

12 hours ago, Blade of Grass said:

Why not just make an array of the words and shuffle them? That sounds easier than randomly transversing a linked list. 

http://stackoverflow.com/questions/6127503/shuffle-array-in-c

Additionally, if you wanted to use an arbitrary phrase (ideally you always want to design for arbitrary values, even if it ultimately the initial value is hard coded) instead of hard coding it, you can look into calloc()realloc(), and strtok() to split a string into an array.

 

The original input string goes into a buffer, so we can free it all at once when were finished.

 

Now, we make a dynamic array to hold the strings.  Since a string is just a char*, an array of strings is a char**.  We keep track of the current size and and how many we've put into it with a pair of integers (eg. placed 7 strings into an array that can hold 10).  Create the initial array with calloc(), expand the array when needed with realloc(), and get the individual words with strtok().

 

Then you shuffle and proceed as normal.

 

Link to comment
Share on other sites

Link to post
Share on other sites

50 minutes ago, Yamoto42 said:

Additionally, if you wanted to use an arbitrary phrase (ideally you always want to design for arbitrary values, even if it ultimately the initial value is hard coded) instead of hard coding it, you can look into calloc()realloc(), and strtok() to split a string into an array.

 

The original input string goes into a buffer, so we can free it all at once when were finished.

 

Now, we make a dynamic array to hold the strings.  Since a string is just a char*, an array of strings is a char**.  We keep track of the current size and and how many we've put into it with a pair of integers (eg. placed 7 strings into an array that can hold 10).  Create the initial array with calloc(), expand the array when needed with realloc(), and get the individual words with strtok().

 

Then you shuffle and proceed as normal.

 

OP would still have to write something from scratch since there's no dynamic array facilities in C.

 

13 hours ago, Blade of Grass said:

Why not just make an array of the words and shuffle them? That sounds easier than randomly transversing a linked list. 

http://stackoverflow.com/questions/6127503/shuffle-array-in-c

Because I wanted to give OP a solution that could be dynamic at run time. An array lends itself to being static and defined at compile time. If OP wanted to use a different phrase, they'd have to re-do the code or if it was to insert something into an array, they would be stuck with the maximum size size they gave the array.

Link to comment
Share on other sites

Link to post
Share on other sites

20 minutes ago, M.Yurizaki said:

OP would still have to write something from scratch since there's no dynamic array facilities in C.

 

Because I wanted to give OP a solution that could be dynamic at run time. An array lends itself to being static and defined at compile time. If OP wanted to use a different phrase, they'd have to re-do the code or if it was to insert something into an array, they would be stuck with the maximum size size they gave the array.

Or you simply declare a large enough static array that works for all reasonable and borderline unreasonable use cases.

i5 4670k @ 4.2GHz (Coolermaster Hyper 212 Evo); ASrock Z87 EXTREME4; 8GB Kingston HyperX Beast DDR3 RAM @ 2133MHz; Asus DirectCU GTX 560; Super Flower Golden King 550 Platinum PSU;1TB Seagate Barracuda;Corsair 200r case. 

Link to comment
Share on other sites

Link to post
Share on other sites

21 minutes ago, M.Yurizaki said:

OP would still have to write something from scratch since there's no dynamic array facilities in C.

 

1 hour ago, Yamoto42 said:

Since a string is just a char*, an array of strings is a char**.  We keep track of the current size and and how many we've put into it with a pair of integers (eg. placed 7 strings into an array that can hold 10).  Create the initial array with calloc(), expand the array when needed with realloc(), and get the individual words with strtok().

That is unless we want to be overly pedantic about stack vs. heap allocation...but with that argument comes the implication of metadata (such as length) and whether "arrays" actually exist at all in C...

Which granted, it is O(n^2) compared to a linked list being O(n), but conversely shuffling a linked list is potentially O(n^2)...but if you have an idea about the data you can choose an effective expansion size.

 Thinking about it, since the worst case scenario is every word is a single character, the most you can add to it with a single read is b/2, where b is the size of your input buffer.  With strtok being in-place, you would need a second one to manage buffers...but there are always optimizations that can be made...and by that point a list does potentially become more efficient as the individual nodes can contain additional metadata such as if they are the start of a buffer and need to be freed.

 

 

Or for best performance the entire thing can be done in O(n) but building the linked list, tracking count and buffers O(n), copying that information to an arrays O(n), then shuffling the array O(n)...:P

Link to comment
Share on other sites

Link to post
Share on other sites

36 minutes ago, Nineshadow said:

Or you simply declare a large enough static array that works for all reasonable and borderline unreasonable use cases.

Unless you know with a high degree of certainty what your maximum size is going to be, this is bad coding practice.

Link to comment
Share on other sites

Link to post
Share on other sites

1 minute ago, M.Yurizaki said:

Unless you know with a high degree of certainty what your maximum size is going to be, this is bad coding practice.

It's more than probably fine for whatever he's doing.

i5 4670k @ 4.2GHz (Coolermaster Hyper 212 Evo); ASrock Z87 EXTREME4; 8GB Kingston HyperX Beast DDR3 RAM @ 2133MHz; Asus DirectCU GTX 560; Super Flower Golden King 550 Platinum PSU;1TB Seagate Barracuda;Corsair 200r case. 

Link to comment
Share on other sites

Link to post
Share on other sites

24 minutes ago, Yamoto42 said:

-Snip-

I guess it ultimately depends on if OP needs the original list of words intact, since my original solution would've been destructive. But the doing a random traverse would avoid needing what I would think is an expensive shuffle operation. It would also be progressively faster since you're removing items from the list as you go, thereby avoiding the need to also keep track of what you already used.

1 minute ago, Nineshadow said:

It's more than probably fine for whatever he's doing.

Yeah, but it's still bad coding practice.

Link to comment
Share on other sites

Link to post
Share on other sites

21 hours ago, CJPowell27 said:

hey guys I'm trying to have the phrase "The quick brown fox jumps over the lazy dog" displayed in a random order and each word only once.

The program I'm writing is going to display a random word then ask the user to type that word, and repeat until all words are done. I'm not familiar with C but all I know is I would have to rand. could anyone give me some pointers? Any help is appreciated. 

A string in C is nothing more then a character array so you can:

 

-Traverse the string and count the words by searching whitespace between words. Replace whitespace with '\0'.

-Malloc memory to store an array of char pointers , the size being the the amount of words counted.

-Traverse the string again and store a pointer to the beginning of each word in the array.

-To display random words randomize a number between 0 and [number of words] and use it as an index into the array.

-Each time a word is used mark it as such by putting its pointer in the array to NULL, so the next randomize can test for already used words.

-Loop until all words are used.

 

Link to comment
Share on other sites

Link to post
Share on other sites

@M.Yurizaki

Building a linked list is O(n), because you can track the tail node for O(1) insertion, meaning the complexity comes entirely form reading.

 

Accessing a random element is O(rand(0, N)), or O(n) since that is the scaling factor.  You must perform n independent random accesses.  As for subtraction: while its technical O(n + (n-1) + (n-2)...2, 1, 0), SUMi=0n(i) = (n(n+1)/2), or eliminating the constant terms O(n^2).

 

With with a linked list, not shuffling is probably the preferred method, however the random access is itself a shuffle, with the exact same properties.


Building a dynamic array is O(n^2), because insertion is O(1), but reading the list is O(n).  After each item, there is the potential for expansion, which requires an O(n) copy into the new location.  This can certainly be optimized, but still technically leaves the overall complexity at O(n^2).  Building the array to support multiple buffers is the same complexity, but occurs concurrently leaving the overall complexity unchanged.

 

Shuffling is then O(n).  The factor shed is due to arrays having O(1) access times.

Ultimately the the only way to avoid O(n^2) is to do a conversion...but that is really not needed for a program of this complexity.

Overall what I would personally do is a circular list.  The only point though is freeing the nodes, as while strtok() is destructive, each buffer only has a single node pointing to its head, the rest point to offsets within it.  But managing multiple input buffers (especially when a word splits a boundary) is the true key to any arbitrary length solution.

Link to comment
Share on other sites

Link to post
Share on other sites

4 hours ago, Unimportant said:

A string in C is nothing more then a character array so you can:

 

-Traverse the string and count the words by searching whitespace between words. Replace whitespace with '\0'.

-Malloc memory to store an array of char pointers , the size being the the amount of words counted.

-Traverse the string again and store a pointer to the beginning of each word in the array.

-To display random words randomize a number between 0 and [number of words] and use it as an index into the array.

-Each time a word is used mark it as such by putting its pointer in the array to NULL, so the next randomize can test for already used words.

-Loop until all words are used.

The problem is you're randomly selecting n positions still. What you're doing here is decreasing the chances of hitting a word that wasn't used. For a sufficiently high word count n, you could be waiting a long time.

Link to comment
Share on other sites

Link to post
Share on other sites

4 hours ago, Unimportant said:

-Traverse the string and count the words by searching whitespace between words. Replace whitespace with '\0'.

-Malloc memory to store an array of char pointers , the size being the the amount of words counted.

-Traverse the string again and store a pointer to the beginning of each word in the array.

-To display random words randomize a number between 0 and [number of words] and use it as an index into the array.

-Each time a word is used mark it as such by putting its pointer in the array to NULL, so the next randomize can test for already used words.

-Loop until all words are used.

1,3:  look up strtok().

 

 

4, 5, 6:

20 minutes ago, M.Yurizaki said:

For a sufficiently high word count n, you could be waiting a long time.

Forever, even.  Potentially fivever.

Link to comment
Share on other sites

Link to post
Share on other sites

17 hours ago, M.Yurizaki said:

The problem is you're randomly selecting n positions still. What you're doing here is decreasing the chances of hitting a word that wasn't used. For a sufficiently high word count n, you could be waiting a long time.

Does it say one has to keep randomizing until hitting a unused word ? All I said was put the pointer to NULL so you can test for used words, the way to handle used words was left open. One could, for example, simply search the closest unused word when the randomizer hits a used word.

Link to comment
Share on other sites

Link to post
Share on other sites

40 minutes ago, Unimportant said:

Does it say one has to keep randomizing until hitting a unused word ? All I said was put the pointer to NULL so you can test for used words, the way to handle used words was left open. One could, for example, simply search the closest unused word when the randomizer hits a used word.

How would the search algorithm know where the closest unused word is?

Link to comment
Share on other sites

Link to post
Share on other sites

6 hours ago, Unimportant said:

Simply search the pointer array in both directions for the closest non NULL pointer?

That would be a needlessly complicated solution for the following reasons:

  • You need a way to track if the offset from your starting position is not out of bounds, in both directions.
  • You need a way to make sure you've actually traversed the entire array to exit. Simply having a condition of "word != NULL" can trap you in an infinite loop if your entire array is full of NULLs.
  • If you want to prevent that by checking the entire array is NULL first, you're doing this for every time you search, which is wasteful.

Tracking what words you've used and avoiding a ton of searching is the biggest hurdle in this problem.

Link to comment
Share on other sites

Link to post
Share on other sites

17 hours ago, M.Yurizaki said:

That would be a needlessly complicated solution for the following reasons:

  • You need a way to track if the offset from your starting position is not out of bounds, in both directions.
  • You need a way to make sure you've actually traversed the entire array to exit. Simply having a condition of "word != NULL" can trap you in an infinite loop if your entire array is full of NULLs.
  • If you want to prevent that by checking the entire array is NULL first, you're doing this for every time you search, which is wasteful.

Tracking what words you've used and avoiding a ton of searching is the biggest hurdle in this problem.

 

1) You know the size of the array obviously (=number of words), It's not hard  to check if something is out of bounds.

2-3) You know the size of the array (=number of words) and how many words you've already used, the math that follows is pretty obvious.

 

Traversing a linked list every iteration looking for the n-th word (where n is the word the randomizer chose for that iteration) is equally "wasteful". You could also traverse the array, looking for the n-th word, skipping NULL's, which would be equal to the linked list.

 

An important note to make about waste here is that waste is only important for programs that are not user limited. This program will be highly user limited as even a 30 yr old PC will traverse a dictionary in the blink of an eye. For such programs it's much better to focus on the option that is easier to code and maintain, rather then focusing on wasted cycles in a program where the bottleneck is how fast the user can type the words. (This is a general notion, not implying any of the other implementation options posted here are better or worse to code and maintain, that is for the OP to experiment and find out).

Link to comment
Share on other sites

Link to post
Share on other sites

Quote

Traversing a linked list every iteration looking for the n-th word (where n is the word the randomizer chose for that iteration) is equally "wasteful". You could also traverse the array, looking for the n-th word, skipping NULL's, which would be equal to the linked list.

With a linked list, once you use a word, you remove it from the list. That avoids the problem of checking if it's used or not because it simply doesn't exist anymore. Traversing becomes increasingly faster as you use more words. You cannot do that with an array with a mixture of valid pointers and NULLs because you still have to check if the entry is NULL or not.

Quote

An important note to make about waste here is that waste is only important for programs that are not user limited. This program will be highly user limited as even a 30 yr old PC will traverse a dictionary in the blink of an eye. For such programs it's much better to focus on the option that is easier to code and maintain, rather then focusing on wasted cycles in a program where the bottleneck is how fast the user can type the words. (This is a general notion, not implying any of the other implementation options posted here are better or worse to code and maintain, that is for the OP to experiment and find out).

If this were a serious project and OP wanted to expand on it, then using a static array has issues. Is it easier to code and read and such? Yes. Is it easier to maintain? Arguably not depending on the data size you're working with. Besides, once you have your linked list logic down, you never have to touch it again.

 

Naive approaches are fine if all you're doing this as a casual hobby, but if you're seriously getting into programming, you should expand on other solutions to put in your tool bag. I'd rather have one solution that I pull out once a blue moon, but it works extremely well for the problem, than to use a naive solution that works okay most of the time.

 

Edit: As an example, I wrote a Python script at work that would feed in a log file and spit out data into a human readable format. The naive approach was just to read the entire file and go through it. But then a few years later, I found out someone was feeding the script a 1.5GB file. This causes Python to crash. So I had to go back and re-write the script to read the file in chunks.

Link to comment
Share on other sites

Link to post
Share on other sites

Create an account or sign in to comment

You need to be a member in order to leave a comment

Create an account

Sign up for a new account in our community. It's easy!

Register a new account

Sign in

Already have an account? Sign in here.

Sign In Now

×