Jump to content

Find the most frequent value in a sorted vector C++

2FA

Code:

sort(most_frequent.begin(), most_frequent.end());
it = most_frequent.begin();
test = *it;
for(it = most_frequent.begin(); it != most_frequent.end(); ++it) {
    cnt++;
    if(*it == test) {
        max_cnt = cnt;
    }
    else {
        if(cnt > max_cnt) {
            max_val = test;
        }
        test = *it;
        cnt = 0;
    }
}

So I'm trying to keep track of the value that appears most frequently (max_val)  in the vector. I've tried a few different arrangements of the variables and condition statements and this is most current version. I've been trying to think about this for the past hour but I can't seem to think of the correct placement of everything.

[Out-of-date] Want to learn how to make your own custom Windows 10 image?

 

Desktop: AMD R9 3900X | ASUS ROG Strix X570-F | Radeon RX 5700 XT | EVGA GTX 1080 SC | 32GB Trident Z Neo 3600MHz | 1TB 970 EVO | 256GB 840 EVO | 960GB Corsair Force LE | EVGA G2 850W | Phanteks P400S

Laptop: Intel M-5Y10c | Intel HD Graphics | 8GB RAM | 250GB Micron SSD | Asus UX305FA

Server 01: Intel Xeon D 1541 | ASRock Rack D1541D4I-2L2T | 32GB Hynix ECC DDR4 | 4x8TB Western Digital HDDs | 32TB Raw 16TB Usable

Server 02: Intel i7 7700K | Gigabye Z170N Gaming5 | 16GB Trident Z 3200MHz

Link to comment
Share on other sites

Link to post
Share on other sites

sort(most_frequent.begin(), most_frequent.end());
auto max_val = test = *most_frequent.begin();
int max_cnt = cnt = 0;
for(auto item : most_frequent) {
    if(item == test) {
        cnt++;
    }
    else {
        if(cnt > max_cnt) {
            max_val = test;
            max_cnt = cnt;
        }
        test = item;
        cnt = 1;
    }
}

This should work, I also added a bit of C++ 11 because its nicer.

And before anyone gets angry at me for not using code tags: the mobile version of the forum doesnt have a code button.

Edited by mathijs727

Desktop: Intel i9-10850K (R9 3900X died 😢 )| MSI Z490 Tomahawk | RTX 2080 (borrowed from work) - MSI GTX 1080 | 64GB 3600MHz CL16 memory | Corsair H100i (NF-F12 fans) | Samsung 970 EVO 512GB | Intel 665p 2TB | Samsung 830 256GB| 3TB HDD | Corsair 450D | Corsair RM550x | MG279Q

Laptop: Surface Pro 7 (i5, 16GB RAM, 256GB SSD)

Console: PlayStation 4 Pro

Link to comment
Share on other sites

Link to post
Share on other sites

7 minutes ago, mathijs727 said:

And before anyone gets angry at me for not using code tags: the mobile version of the forum doesnt have a code button.

Fair, but I will get angry at you for the text color and missing the edge cases.

1474412270.2748842

Link to comment
Share on other sites

Link to post
Share on other sites

9 minutes ago, fizzlesticks said:

Fair, but I will get angry at you for the text color and missing the edge cases.

Fixed the code tags.

With missing edge case you mean it will probably crash (iterator dereference on line 2) if the list is empty?

Desktop: Intel i9-10850K (R9 3900X died 😢 )| MSI Z490 Tomahawk | RTX 2080 (borrowed from work) - MSI GTX 1080 | 64GB 3600MHz CL16 memory | Corsair H100i (NF-F12 fans) | Samsung 970 EVO 512GB | Intel 665p 2TB | Samsung 830 256GB| 3TB HDD | Corsair 450D | Corsair RM550x | MG279Q

Laptop: Surface Pro 7 (i5, 16GB RAM, 256GB SSD)

Console: PlayStation 4 Pro

Link to comment
Share on other sites

Link to post
Share on other sites

1 minute ago, mathijs727 said:

Fixed the code tags.

With missing edge case you mean it will probably crash (iterator dereference on line 2) if the list is empty?

That and it doesn't test the last value of elements in the vector, you'd need an extra check after the loop.

1474412270.2748842

Link to comment
Share on other sites

Link to post
Share on other sites

21 minutes ago, fizzlesticks said:

That and it doesn't test the last value of elements in the vector, you'd need an extra check after the loop.

Good point

Desktop: Intel i9-10850K (R9 3900X died 😢 )| MSI Z490 Tomahawk | RTX 2080 (borrowed from work) - MSI GTX 1080 | 64GB 3600MHz CL16 memory | Corsair H100i (NF-F12 fans) | Samsung 970 EVO 512GB | Intel 665p 2TB | Samsung 830 256GB| 3TB HDD | Corsair 450D | Corsair RM550x | MG279Q

Laptop: Surface Pro 7 (i5, 16GB RAM, 256GB SSD)

Console: PlayStation 4 Pro

Link to comment
Share on other sites

Link to post
Share on other sites

You can actually do this in a different way using an unordered_map. You don't have to sort for this.
It builds up a table that will give you the frequency of each value, so you can do more with it than just the maximum frequency. (For example if there are more than 1 with the biggest frequency, you can find these values, or you want the highest and lowest frequency, etc).

 

//Build a frequency table
std::unordered_map<int, int> hash_table;
for(auto &i : my_vector)
{
  hash_table[i]++;
}
//Find the value with the maximum frequency
int max_freq = -1, most_freq_val = -1; //Initialize these values to -1
for(auto &i : hash_table)
{
  if(i.second > max_freq)
  {
    max_freq = i.second;
    most_freq_val = i.first;
  }
}

 

Link to comment
Share on other sites

Link to post
Share on other sites

5 hours ago, DeadEyePsycho said:

Code:


sort(most_frequent.begin(), most_frequent.end());
it = most_frequent.begin();
test = *it;
for(it = most_frequent.begin(); it != most_frequent.end(); ++it) {
    cnt++;
    if(*it == test) {
        max_cnt = cnt;
    }
    else {
        if(cnt > max_cnt) {
            max_val = test;
        }
        test = *it;
        cnt = 0;
    }
}

So I'm trying to keep track of the value that appears most frequently (max_val)  in the vector. I've tried a few different arrangements of the variables and condition statements and this is most current version. I've been trying to think about this for the past hour but I can't seem to think of the correct placement of everything.

	//vec is the name of our vector.
	//Code assumes the vector is NOT empty (test for this first).
	//Assumes vec is sorted.

	decltype(vec)::value_type maxVal;
	int lastCount = 0; 
	for (auto it = vec.begin(); it != vec.end(); )
	{
		int count = 0;
		it = std::find_if_not(it, vec.end(), [&](decltype(vec)::value_type i){ return i == *it ? ++count : false; });
		if (count > lastCount)
		{
			lastCount = count;
			maxVal = *(it - 1);
		}
	}

 

Link to comment
Share on other sites

Link to post
Share on other sites

 

7 hours ago, fizzlesticks said:

That and it doesn't test the last value of elements in the vector, you'd need an extra check after the loop.

 

7 hours ago, mathijs727 said:

Good point

I'm pretty new to C++ and vectors, could you explain what you mean?

[Out-of-date] Want to learn how to make your own custom Windows 10 image?

 

Desktop: AMD R9 3900X | ASUS ROG Strix X570-F | Radeon RX 5700 XT | EVGA GTX 1080 SC | 32GB Trident Z Neo 3600MHz | 1TB 970 EVO | 256GB 840 EVO | 960GB Corsair Force LE | EVGA G2 850W | Phanteks P400S

Laptop: Intel M-5Y10c | Intel HD Graphics | 8GB RAM | 250GB Micron SSD | Asus UX305FA

Server 01: Intel Xeon D 1541 | ASRock Rack D1541D4I-2L2T | 32GB Hynix ECC DDR4 | 4x8TB Western Digital HDDs | 32TB Raw 16TB Usable

Server 02: Intel i7 7700K | Gigabye Z170N Gaming5 | 16GB Trident Z 3200MHz

Link to comment
Share on other sites

Link to post
Share on other sites

4 hours ago, DeadEyePsycho said:

I'm pretty new to C++ and vectors, could you explain what you mean?

In your original code and the code mathijs posted, once you get to the end of the loop you don't check the last value for having more occurrences. So for example if the vector only has 1 element, you go into the loop, add 1 to the count then exit the loop without checking if 1 >  max_cnt. To fix that you'd need to add another identical check after the loop to do the final test. Or change the design into something like Unimportants.

1474412270.2748842

Link to comment
Share on other sites

Link to post
Share on other sites

7 hours ago, fizzlesticks said:

In your original code and the code mathijs posted, once you get to the end of the loop you don't check the last value for having more occurrences. So for example if the vector only has 1 element, you go into the loop, add 1 to the count then exit the loop without checking if 1 >  max_cnt. To fix that you'd need to add another identical check after the loop to do the final test. Or change the design into something like Unimportants.

 

With his code, I get two errors.

 

main.cpp|68|error: expected primary-expression before 'auto'

main.cpp|68|error: expected ')' before 'auto'

 

I can't spot anything obvious that's wrong and autos are not something I normally use. This specific line is the for loop conditions.

[Out-of-date] Want to learn how to make your own custom Windows 10 image?

 

Desktop: AMD R9 3900X | ASUS ROG Strix X570-F | Radeon RX 5700 XT | EVGA GTX 1080 SC | 32GB Trident Z Neo 3600MHz | 1TB 970 EVO | 256GB 840 EVO | 960GB Corsair Force LE | EVGA G2 850W | Phanteks P400S

Laptop: Intel M-5Y10c | Intel HD Graphics | 8GB RAM | 250GB Micron SSD | Asus UX305FA

Server 01: Intel Xeon D 1541 | ASRock Rack D1541D4I-2L2T | 32GB Hynix ECC DDR4 | 4x8TB Western Digital HDDs | 32TB Raw 16TB Usable

Server 02: Intel i7 7700K | Gigabye Z170N Gaming5 | 16GB Trident Z 3200MHz

Link to comment
Share on other sites

Link to post
Share on other sites

12 minutes ago, DeadEyePsycho said:

With his code, I get two errors.

 

main.cpp|68|error: expected primary-expression before 'auto'

main.cpp|68|error: expected ')' before 'auto'

 

I can't spot anything obvious that's wrong and autos are not something I normally use. This specific line is the for loop conditions.

He's using c++14(11?) features, if using g++ or Clang you'll need to compile with the -std=c++14 flag.

1474412270.2748842

Link to comment
Share on other sites

Link to post
Share on other sites

5 hours ago, fizzlesticks said:

He's using c++14(11?) features, if using g++ or Clang you'll need to compile with the -std=c++14 flag.

Plain c++11.

Link to comment
Share on other sites

Link to post
Share on other sites

5 hours ago, DeadEyePsycho said:

With his code, I get two errors.

 

main.cpp|68|error: expected primary-expression before 'auto'

main.cpp|68|error: expected ')' before 'auto'

 

I can't spot anything obvious that's wrong and autos are not something I normally use. This specific line is the for loop conditions.

In addition to enabling c++11, like @fizzlesticks said, make sure to include header <algorithm> for std::find_if_not.

 

decltype(vec)::value_type x; //...

...defines variable x with the vector template type(*), so you could get rid of that and hardcode in the type, if your compiler gives you trouble with that (it shouldn't).

 

(* It makes the code more generic, being able to work with any vector, whatever it's template type is, as long as the template type supports the equality operator).

Link to comment
Share on other sites

Link to post
Share on other sites

It makes me smile when i see the big three all in one thread. Thanks for always educating me (to unimportant, fizzlesticks and mathijis727, i'm sure there are others but i always see these three).

Link to comment
Share on other sites

Link to post
Share on other sites

On 4/14/2017 at 0:33 AM, fizzlesticks said:

He's using c++14(11?) features, if using g++ or Clang you'll need to compile with the -std=c++14 flag.

I didn't get a chance to work on this until now.

 

For some reason, my compiler (g++) absolutely refuses to accept "auto item". Actually, it refuses to work with "auto" regardless of what standard I'm using.

[Out-of-date] Want to learn how to make your own custom Windows 10 image?

 

Desktop: AMD R9 3900X | ASUS ROG Strix X570-F | Radeon RX 5700 XT | EVGA GTX 1080 SC | 32GB Trident Z Neo 3600MHz | 1TB 970 EVO | 256GB 840 EVO | 960GB Corsair Force LE | EVGA G2 850W | Phanteks P400S

Laptop: Intel M-5Y10c | Intel HD Graphics | 8GB RAM | 250GB Micron SSD | Asus UX305FA

Server 01: Intel Xeon D 1541 | ASRock Rack D1541D4I-2L2T | 32GB Hynix ECC DDR4 | 4x8TB Western Digital HDDs | 32TB Raw 16TB Usable

Server 02: Intel i7 7700K | Gigabye Z170N Gaming5 | 16GB Trident Z 3200MHz

Link to comment
Share on other sites

Link to post
Share on other sites

Actually, I got it working. I ended up using my loop conditions with @mathijs727's if/else statements. Thanks everyone.

[Out-of-date] Want to learn how to make your own custom Windows 10 image?

 

Desktop: AMD R9 3900X | ASUS ROG Strix X570-F | Radeon RX 5700 XT | EVGA GTX 1080 SC | 32GB Trident Z Neo 3600MHz | 1TB 970 EVO | 256GB 840 EVO | 960GB Corsair Force LE | EVGA G2 850W | Phanteks P400S

Laptop: Intel M-5Y10c | Intel HD Graphics | 8GB RAM | 250GB Micron SSD | Asus UX305FA

Server 01: Intel Xeon D 1541 | ASRock Rack D1541D4I-2L2T | 32GB Hynix ECC DDR4 | 4x8TB Western Digital HDDs | 32TB Raw 16TB Usable

Server 02: Intel i7 7700K | Gigabye Z170N Gaming5 | 16GB Trident Z 3200MHz

Link to comment
Share on other sites

Link to post
Share on other sites

2 hours ago, DeadEyePsycho said:

I didn't get a chance to work on this until now.

 

For some reason, my compiler (g++) absolutely refuses to accept "auto item". Actually, it refuses to work with "auto" regardless of what standard I'm using.

If you could post the offending code (with some context/surrounding code) we can take a look at it. auto should work, unless your g++ predates c++11 (unlikely).

Link to comment
Share on other sites

Link to post
Share on other sites

2 hours ago, Unimportant said:

If you could post the offending code (with some context/surrounding code) we can take a look at it. auto should work, unless your g++ predates c++11 (unlikely).

 

It's quite literally the code in this post except with an extra check just after the for loop.

sort(most_frequent.begin(), most_frequent.end());
auto max_val = test = *most_frequent.begin();
int max_cnt = cnt = 0;
for(auto item : most_frequent) {
    if(item == test) {
        cnt++;
    }
    else {
        if(cnt > max_cnt) {
            max_val = test;
            max_cnt = cnt;
        }
        test = item;
        cnt = 1;
    }
}
if(item == test) {
        cnt++;
    }
else {
	if(cnt > max_cnt) {
		max_val = test;
        max_cnt = cnt;
    }
    test = item;
    cnt = 1;
}

I did some testing just now and I realized what was causing the issue. That is 'item' only exists in the scope of the for loop which means I would need to set some other variable to value of 'item' when the loop ends to use in place of 'item'. 

[Out-of-date] Want to learn how to make your own custom Windows 10 image?

 

Desktop: AMD R9 3900X | ASUS ROG Strix X570-F | Radeon RX 5700 XT | EVGA GTX 1080 SC | 32GB Trident Z Neo 3600MHz | 1TB 970 EVO | 256GB 840 EVO | 960GB Corsair Force LE | EVGA G2 850W | Phanteks P400S

Laptop: Intel M-5Y10c | Intel HD Graphics | 8GB RAM | 250GB Micron SSD | Asus UX305FA

Server 01: Intel Xeon D 1541 | ASRock Rack D1541D4I-2L2T | 32GB Hynix ECC DDR4 | 4x8TB Western Digital HDDs | 32TB Raw 16TB Usable

Server 02: Intel i7 7700K | Gigabye Z170N Gaming5 | 16GB Trident Z 3200MHz

Link to comment
Share on other sites

Link to post
Share on other sites

8 minutes ago, DeadEyePsycho said:

I did some testing just now and I realized what was causing the issue. That is 'item' only exists in the scope of the for loop which means I would need to set some other variable to value of 'item' when the loop ends to use in place of 'item'. 

You don't need to check the item, only the count and you can get the item by using vector.back().

1474412270.2748842

Link to comment
Share on other sites

Link to post
Share on other sites

1 minute ago, fizzlesticks said:

You don't need to check the item, only the count and you get get the item by using vector.back().

So just this part?

if(cnt > max_cnt) {
		max_val = test;
        max_cnt = cnt;
}

Also, I don't why it's adding whitespace on some of the lines. It looks fine in the insert code editor.

[Out-of-date] Want to learn how to make your own custom Windows 10 image?

 

Desktop: AMD R9 3900X | ASUS ROG Strix X570-F | Radeon RX 5700 XT | EVGA GTX 1080 SC | 32GB Trident Z Neo 3600MHz | 1TB 970 EVO | 256GB 840 EVO | 960GB Corsair Force LE | EVGA G2 850W | Phanteks P400S

Laptop: Intel M-5Y10c | Intel HD Graphics | 8GB RAM | 250GB Micron SSD | Asus UX305FA

Server 01: Intel Xeon D 1541 | ASRock Rack D1541D4I-2L2T | 32GB Hynix ECC DDR4 | 4x8TB Western Digital HDDs | 32TB Raw 16TB Usable

Server 02: Intel i7 7700K | Gigabye Z170N Gaming5 | 16GB Trident Z 3200MHz

Link to comment
Share on other sites

Link to post
Share on other sites

1 minute ago, DeadEyePsycho said:

So just this part?

Yes, the rest is already being done inside the loop.

1474412270.2748842

Link to comment
Share on other sites

Link to post
Share on other sites

Create an account or sign in to comment

You need to be a member in order to leave a comment

Create an account

Sign up for a new account in our community. It's easy!

Register a new account

Sign in

Already have an account? Sign in here.

Sign In Now

×