Jump to content

Yet another algorithmic problem...

Nineshadow

Given N natural numbers and another natural number K ( N and K <= 1.000.000 ) , eliminate a sequence of minimum length such that every number from the remaining series appears a maximum of K times. Print the length of such a sequence.

 

Example :

Input :

6 1
3 3 3 2 1 2

Output :

3

The eliminated sequence is : (3,3,2) .

_____________________________________________________

 

Of course , it needs to be as efficient as possible, and my current idea doesn't really fit in the execution time limit (0.3 sec) .

i5 4670k @ 4.2GHz (Coolermaster Hyper 212 Evo); ASrock Z87 EXTREME4; 8GB Kingston HyperX Beast DDR3 RAM @ 2133MHz; Asus DirectCU GTX 560; Super Flower Golden King 550 Platinum PSU;1TB Seagate Barracuda;Corsair 200r case. 

Link to comment
Share on other sites

Link to post
Share on other sites

Could you please rephrase that question? Also, what's your solution? What did you write it in? How long did it execute? What did you write it in?

Link to comment
Share on other sites

Link to post
Share on other sites

Just now, Gachr said:

Could you please rephrase that question? Also, what's your solution? What did you write it in? How long did it execute? What did you write it in?

Pretty much bruteforce. Check all possible sequences , check the condition for the remaining series and then get the minimum of the lengths of these sequences. It was the first thing that came into my mind and I was pretty sure that it was a bad idea since there can be up to1 million numbers in the series. And it was.

 

Let's say N = 6 and K = 1 , and the numbers are {3,3,3,2,1,2} .  We want each number to only appear up to K times , and since K is 1 , we want them to appear only once. But we have 3 appearances of the number 3, and 2 appearances of the number 2. We need to remove a sequence so that we end up with only 1 number of 3 and 1 number of 2. What we have to do is get the minimum length of a sequence which can do such a thing. It would look something like this (the numbers in bold are part of the sequence which we remove)

3 , 3 , 3 , 2 , 1 , 2

3 , 1 , 2

i5 4670k @ 4.2GHz (Coolermaster Hyper 212 Evo); ASrock Z87 EXTREME4; 8GB Kingston HyperX Beast DDR3 RAM @ 2133MHz; Asus DirectCU GTX 560; Super Flower Golden King 550 Platinum PSU;1TB Seagate Barracuda;Corsair 200r case. 

Link to comment
Share on other sites

Link to post
Share on other sites

Not sure if I understand it yet, but here's a shot. I assume that you're doing this for C++. I don't know anything about C++, but I would do it in Java by using a HashMap<Integer, Integer>. A map is like a list of mappings, where the first half of the mapping is the "key" and the second half of the mapping is an assigned "value." I think you could simulate a map by using two arrays (or two vectors in C++, I believe), but since a map exists in C++ I'll just use the pseudocode for a map. 

 

In this case, the first integer (the key) is a number in the sequence and the second integer (the value) is how many times that number shows up. With this method, you only need to go through the sequence once. 

 

Here's some pseudocode for the process:

You have three variables:
an empty map
the original sequence
an integer removedCounter

You read the first number in the sequence, which is 3.
If the map does not already have the key 3, then {
	Add the mapping (3, 1) to the map. //This means that the number 3 has now shown up in the original sequence 1 time.
} else if the map already has the key 3, then {
	If the value mapped to the key 3 is less than k, then {
		Replace the mapping with (3, originalValue + 1) //This increments the number of times 3 has shown up by one. 
	} else if the value mapped to the key 3 is more than k, then {
		Add 1 to your removedCounter. Go on to the next number in the sequence. 
	}
}
Print removedCounter.

 

This doesn't actually remove anything from the sequence. If you're required to have that in code, then you could easily add in a vector of removed numbers. 

 

edit: a Java version of the program with 6 integers and a k value of 1 takes less than a millisecond to run. The Java version with 1 million integers and a k value of 1 takes 320 - 360 milliseconds to run in IntelliJ Idea on a Core i3-4000M. With a k value of 1 million, it takes 800 - 900 milliseconds to run. C++ should be faster than Java, and you're likely using a more powerful computer than mine. I don't have time to implement a two-vector version instead of the map version, but you might squeeze some extra speed out of it if you use the two-vector version. 

Link to comment
Share on other sites

Link to post
Share on other sites

35 minutes ago, Philosobyte said:

-snip

The thing is , you must remove a contiguous sequence from the original series.

Example :

If you had  3 3 1 2 2 and each number had to appear at most once , then you'd have to remove { 3 , 1 , 2 } . ( 3, 3, 1 , 2, 2 )

 

I actually did the same thing as you did the first time I tried it (well , kind of, I didn't use hash maps since they are overkill in this case , I just used a characteristic/indicator vector) , but it didn't output the correct results .

i5 4670k @ 4.2GHz (Coolermaster Hyper 212 Evo); ASrock Z87 EXTREME4; 8GB Kingston HyperX Beast DDR3 RAM @ 2133MHz; Asus DirectCU GTX 560; Super Flower Golden King 550 Platinum PSU;1TB Seagate Barracuda;Corsair 200r case. 

Link to comment
Share on other sites

Link to post
Share on other sites

11 minutes ago, Nineshadow said:

The thing is , you must remove a contiguous sequence from the original series.

Example :

If you had  3 3 1 2 2 and each number had to appear at most once , then you'd have to remove { 3 , 1 , 2 } . ( 3, 3, 1 , 2, 2 )

I see. In my pseudocode, then, the removedCounter would be extraneous. 

 

We would have to use two extra integer variables. One integer variable defines the index of the first time k is exceeded - that variable will only be assigned once. A second integer variable defines the index of the last time k is exceeded - that variable will be assigned each time k is exceeded. By the time you reach the end of the sequence, your subsequence will be between those two integers (or plus or minus one depending on how indexing works in C++). 

 

How does this sound?

Link to comment
Share on other sites

Link to post
Share on other sites

4 minutes ago, Philosobyte said:

I see. In my pseudocode, then, the removedCounter would be extraneous. 

 

We would have to use two extra integer variables. One integer variable defines the index of the first time k is exceeded - that variable will only be assigned once. A second integer variable defines the index of the last time k is exceeded - that variable will be assigned each time k is exceeded. By the time you reach the end of the sequence, your subsequence will be between those two integers (or plus or minus one depending on how indexing works in C++). 

 

How does this sound?

Won't work. Take the original example. 3 3 3 2 1 2 . The first integer în excess is at index 1. The last integer în excess is at index 5. Nevertheless, I think I have an idea going on. We know that the lenght must be , at its minimum, the number of integers with more than k appearances. Then there's that plus the number of integers required to make all of those în a contiguous sequence. It' late now and I'm going to sleep. I'll try to implement this tonorrow.

i5 4670k @ 4.2GHz (Coolermaster Hyper 212 Evo); ASrock Z87 EXTREME4; 8GB Kingston HyperX Beast DDR3 RAM @ 2133MHz; Asus DirectCU GTX 560; Super Flower Golden King 550 Platinum PSU;1TB Seagate Barracuda;Corsair 200r case. 

Link to comment
Share on other sites

Link to post
Share on other sites

2 minutes ago, Nineshadow said:

Won't work. Take the original example. 3 3 3 2 1 2 . The first integer în excess is at index 1. The last integer în excess is at index 5. Nevertheless, I think I have an idea going on. We know that the lenght must be , at its minimum, the number of integers with more than k appearances. Then there's that plus the number of integers required to make all of those în a contiguous sequence. It' late now and I'm going to sleep. I'll try to implement this tonorrow.

Ah, so instead of the last time k is exceeded, it must be the second-to-last time k is exceeded. Sorry for the oversight.

 

Link to comment
Share on other sites

Link to post
Share on other sites

I came up with a solution that uses a sliding window of deletable sequences but the speed is highly dependent on the maximum possible value in the input. If the input ranges from 0 to 500ish, the performance is decent but any higher than that and it gets pretty slow. This could be fixed by using a different data structure that would allow fast random access and fast max_element instead of a vector (does something like that exist?)

 

My solution iterates through the input and for each element finds the shortest possible sequence that must be deleted if you started deleting at that point. The length of the sequence starting at each element is at least the length of the previous sequence - 1.

 

 

Spoiler

#include <iostream>
#include <vector>
#include <algorithm>
#include <random>

using namespace std;

int main()
{
    int MAX_VAL = 250;
    
    vector<int> v(1000000);
    int K = 11500;

    auto dist = uniform_int_distribution<int>(0,MAX_VAL);
    mt19937 ran(random_device{}());
    generate(begin(v), end(v), [&dist, &ran](){ return dist(ran); });
    
    /*
    vector<int> v{1,1,1,1,2,1,1,2};
    int K = 2;
    */

    int m = *max_element(begin(v),end(v));
    vector<int> counts(m+1, 0);
    for(auto it=begin(v); it != end(v); ++it)
    {
        if(counts[*it] == 0)
        {
            counts[*it] = count(it, end(v), *it);
        }
    }

    int minRun = 999999999;
    auto minStart = begin(v);
    int lastRun = 1;
    for (auto it = begin(v); it != end(v); ++it)
    {
        if(lastRun == 0)
        {
            break;
        }
        auto startPoint = it;
        int runLen = lastRun - 1;
        if(it != begin(v))
        {
            ++counts[*(it - 1)];
        }

        auto mIt = max_element(begin(counts), end(counts));
        while(*mIt > K)
        {
            if(it + runLen == end(v))
            {
                break;
            }
            int n = *(it + runLen);
            --counts[n];
            ++runLen;
            if(distance(begin(counts), mIt) == n)
            {
                mIt = max_element(begin(counts), end(counts));
            }
        }

        if(it + runLen == end(v))
        {
            break;
        }
        else
        {
            if(runLen < minRun)
            {
                minRun = runLen;
                minStart = startPoint;
            }
        }
        lastRun = runLen;
    }

    cout << minRun << endl;
    cout << "Starting at position: " << distance(begin(v), minStart) << endl;
    return 0;
}

 

 

1474412270.2748842

Link to comment
Share on other sites

Link to post
Share on other sites

You see, I've given you several wrong suggestions now. But each time I do, I get closer to the solution. I shouldn't be so confident in my suggestions for future purposes, but this time I'm confident because this time I actually wrote the program. The last index of your subsequence will be not the second-to-last-time k is exceeded, but rather either the second-to-last-time k is exceeded or the last time k is met, depending on which comes later/larger.

 

I also ditched the map and used a single array. Each index in the array corresponds to a possible natural number in the sequence, and each value in the array corresponds to that natural number's frequency, or the number of times the natural number has shown up in the sequence. 

This way, the program will not need to search for a value - it already knows where the value is in memory. This saves a ton of time. 

 

I've created working code in Java which runs N <= 1,000,000 and k <= 1,000,000, where each number in the sequence <= 1,000,000 (You could expand this value up to a limit. I have no idea about the maximum size of an array in C++). Here's the Java code, which will have to act as pseudocode. It runs in under 50 milliseconds. I also tested it with your example of 3, 3, 1, 2, 2, and it gave the correct output in 4 milliseconds (long initial start-up time due to array initialization).

 

 

public class Test {

    public static void main(String[] args) {
        int n = 1000000;
        int[] sequence = new int[n];

        for (int i = 0; i < sequence.length; i++) {
            sequence[i] = (int)(Math.random() * 1000000);
        }
        long time = System.currentTimeMillis();
        int k = 5;
        int[] array = new int[1000001];
        int firstIndex = 0;
        int lastIndex = 0;
        int lastMetOrSecondLastExceeded = 0;

        for (int i = 0; i < sequence.length; i++) {
            int number = sequence[i];
            int frequency = array[number];

            if (frequency < k) {
                frequency++;
                array[number] = frequency;
                if(frequency == k) {
                    lastMetOrSecondLastExceeded = i;
                }
                continue;
            }
            if (firstIndex == 0) {
                firstIndex = i;
                continue;
            }
            if (lastMetOrSecondLastExceeded < lastIndex) {
                lastMetOrSecondLastExceeded = lastIndex;
            }
            lastIndex = i;
        }
        System.out.println(lastMetOrSecondLastExceeded - firstIndex + 1);
        System.out.println(System.currentTimeMillis() - time);
    }
}

 

 

Link to comment
Share on other sites

Link to post
Share on other sites

2 hours ago, fizzlesticks said:

-snip-

 

Yup , that's pretty the same idea I used. Except without STL , because...oh well , no idea why honestly.

 

What about this:

 

The required length will always be at least equal to the numbers of element in excess.

Let's say we have { 3, 3, 3, 2, 1, 2 }. We have two of '3' in excess and one '2' in excess. The sequence which we delete must always have a length of at least 3. It's a pretty obvious observation . Now , in this example , there are 3 elements with the property we want in a contiguous sequence , so the minimum length which we need to output will be equal to the numbers of elements in excess.

 

But let's say we have { 3 , 3 , 3 , 1 , 2 , 2 }. Yet again , the bare-minimum of any sequence which we want to delete will be 3. But in this case , there are no 3 consecutive elements with the proprieties we want. Let's highlight with bold the elements which we need to remove : { 3 , 3 , 3 , 1 , 2 , 2 } . We need to make out of these a contiguous sequence , so we also remove the 1 , and we end up with :  { 3 , 3 , 3 , 1 , 2 , 2 } .

 

So it the problem will resume itself to another problem of making out of a few different sequences a contiguous sequence and finding the amount of elements we need to change in order for that to happen. Except I don't really see a good way of solving this one. In the first place , how do we select which elements from the original array we want to remove?

Considering { 3 , 3 , 3 , 1 , 2 , 2 }  , it could just as easily have been { 3 , 3 , 3 , 1 , 2 , 2 } . But what stops it from choosing something like { 3 , 3 , 3 , 1 , 2 , 2 } ?

 

57 minutes ago, Philosobyte said:

-snip-

 

Actually no , you don't get closer to the solution . It's easy to make an observation for a given example , but it's way harder to make a generic observation based on any arbitrary input. I don't see any specific reason for why things must work the way you presented them , other than that's how it works for the 2-3 examples we've had. But then , I can easily come up with an example that doesn't fit your solution.

 

And , indeed , it's not the way things work. After putting your source to the tests, it only got 1 of them right (out of 10). The others didn't have the correct result

i5 4670k @ 4.2GHz (Coolermaster Hyper 212 Evo); ASrock Z87 EXTREME4; 8GB Kingston HyperX Beast DDR3 RAM @ 2133MHz; Asus DirectCU GTX 560; Super Flower Golden King 550 Platinum PSU;1TB Seagate Barracuda;Corsair 200r case. 

Link to comment
Share on other sites

Link to post
Share on other sites

Alrighty, then. Please do me a favor and sate my curiosity by PMing me all the sequences so I can test myself. 

Thank you. 

Link to comment
Share on other sites

Link to post
Share on other sites

Didn't do too much testing but assuming I didn't break anything, this should be fast enough (maybe).

I added a second vector of pointers to keep track of where elements are while the main vector is sorted by each elements count which solves the fast random access + fast max_element I was missing before.

 

Spoiler

#include <iostream>
#include <vector>
#include <algorithm>
#include <random>
#include <map>
#include <chrono>

using std::vector;
using std::sort;
using std::begin;
using std::end;

struct mpair;
vector<mpair*> pointers;

struct mpair
{
    int first;
    int second;

    mpair() : first(0), second(0) {}

    bool operator<(const mpair& o)
    {
        if (second == o.second)
        {
            return first < o.first;
        }
        else
        {
            return second < o.second;
        }
    }
};

void swap(mpair& p1, mpair& p2)
{
    std::swap(pointers[p1.first], pointers[p2.first]);
    std::swap(p1.first, p2.first);
    std::swap(p1.second, p2.second);
}

struct mpairCmp
{
    bool operator()(const mpair& p1, const mpair& p2)
    {
        if (p1.second == p2.second)
        {
            return p1.first < p2.first;
        }
        else
        {
            return !(p1.second < p2.second);
        }
    }
};


int main()
{
    int MAX_VAL = 100;
    
    vector<int> v(1000000);
    int K = 200;
    
    auto dist = std::uniform_int_distribution<int>(0,MAX_VAL);
    std::mt19937 ran(std::random_device{}());
    generate(begin(v), end(v), [&dist, &ran](){ return dist(ran); });
    
    /*
    vector<int> v{ 4,3,3,3,2,2,1,3,3,1,2 };
    int K = 3;*/

    auto startTime = std::chrono::high_resolution_clock::now();
    int m = *max_element(begin(v),end(v));
    vector<mpair> counts(m+1);
    pointers.resize(m+1);

    for(int i=0; i < m+1; ++i)
    {
        counts[i].first = i;
    }
    for(int i=0; i < v.size(); ++i)
    {
        ++counts[v[i]].second;
    }

    sort(begin(counts), end(counts), mpairCmp());
    for(auto& p : counts)
    {
        pointers[p.first] = &p;
    }

    int minRun = 999999999;
    auto minStart = begin(v);
    int lastRun = 1;
    for (auto it = begin(v); it != end(v); ++it)
    {
        if(lastRun == 0)
        {
            break;
        }
        auto startPoint = it;
        int runLen = lastRun - 1;
        if(it != begin(v))
        {
            int n = *(it - 1);
            ++pointers[n]->second;
            int cur = pointers[n] - &counts[0];
            while(cur > 0 && !(counts[cur] < counts[cur-1]))
            {
                swap(counts[cur], counts[cur-1]);
                --cur;
            }
        }

        while(counts[0].second > K)
        {
            if(it + runLen == end(v))
            {
                break;
            }
            int n = *(it + runLen);
            --pointers[n]->second;
            int cur = pointers[n] - &counts[0];
            while(cur < counts.size() - 1 && counts[cur] < counts[cur+1])
            {
                swap(counts[cur], counts[cur+1]);
                ++cur;
            }

            ++runLen;
        }

        if(it + runLen == end(v))
        {
            break;
        }
        else
        {
            if(runLen < minRun)
            {
                minRun = runLen;
                minStart = startPoint;
            }
        }
        lastRun = runLen;
    }
    
    std::cout << minRun << std::endl;
    std::cout << "Starting at position: " << std::distance(begin(v), minStart) << std::endl;
    auto diff = std::chrono::high_resolution_clock::now() - startTime;
    std::cout << "Time: " << std::chrono::duration_cast<std::chrono::milliseconds>(diff).count() << std::endl;
    return 0;
}

 

 

1474412270.2748842

Link to comment
Share on other sites

Link to post
Share on other sites

8 hours ago, Philosobyte said:

Alrighty, then. Please do me a favor and sate my curiosity by PMing me all the sequences so I can test myself. 

Thank you. 

http://www.infoarena.ro/problema/maxk You need to make an account to send a source file ( C/C++, Java or Pascal iirc) and it will evaluate it.

i5 4670k @ 4.2GHz (Coolermaster Hyper 212 Evo); ASrock Z87 EXTREME4; 8GB Kingston HyperX Beast DDR3 RAM @ 2133MHz; Asus DirectCU GTX 560; Super Flower Golden King 550 Platinum PSU;1TB Seagate Barracuda;Corsair 200r case. 

Link to comment
Share on other sites

Link to post
Share on other sites

@fizzlesticks
Yeah , that definitely improved the execution time. It's actually a pretty nice idea. Anyway , it's ALMOST fast enough. It passes 9/10 tests. D:

i5 4670k @ 4.2GHz (Coolermaster Hyper 212 Evo); ASrock Z87 EXTREME4; 8GB Kingston HyperX Beast DDR3 RAM @ 2133MHz; Asus DirectCU GTX 560; Super Flower Golden King 550 Platinum PSU;1TB Seagate Barracuda;Corsair 200r case. 

Link to comment
Share on other sites

Link to post
Share on other sites

7 hours ago, Nineshadow said:

 It passes 9/10 tests. D:

Ah those damn Russian judges. 

It could be made a bit faster by changing those bubble sorts to more of an insertion sort. Don't know if that would make a big enough difference.

 

edit: ok I was wrong, that increased the speed by a ton.

 

Spoiler

#include <iostream>
#include <vector>
#include <algorithm>
#include <random>
#include <map>
#include <chrono>

using std::vector;
using std::sort;
using std::begin;
using std::end;

struct mpair;
vector<mpair*> pointers;

struct mpair
{
    int first;
    int second;

    mpair() : first(0), second(0) {}

    bool operator<(const mpair& o)
    {
        if (second == o.second)
        {
            return first < o.first;
        }
        else
        {
            return second < o.second;
        }
    }
};

void swap(mpair& p1, mpair& p2)
{
    std::swap(pointers[p1.first], pointers[p2.first]);
    std::swap(p1.first, p2.first);
    std::swap(p1.second, p2.second);
}

struct mpairCmp
{
    bool operator()(const mpair& p1, const mpair& p2)
    {
        if (p1.second == p2.second)
        {
            return p1.first < p2.first;
        }
        else
        {
            return !(p1.second < p2.second);
        }
    }
};


int main()
{
    auto startTime = std::chrono::high_resolution_clock::now();
    int MAX_VAL = 100000;
    
    vector<int> v(1000000);
    int K = 15;
    
    auto dist = std::uniform_int_distribution<int>(0,MAX_VAL);
    std::mt19937 ran(std::random_device{}());
    generate(begin(v), end(v), [&dist, &ran](){ return dist(ran); });
    
    /*
    vector<int> v{ 4,3,3,3,2,2,1,3,3,1,2 };
    int K = 1;
    */
    int m = *max_element(begin(v),end(v));
    vector<mpair> counts(m+1);
    pointers.resize(m+1);

    for(int i=0; i < m+1; ++i)
    {
        counts[i].first = i;
    }
    for(int i=0; i < v.size(); ++i)
    {
        ++counts[v[i]].second;
    }

    sort(begin(counts), end(counts), mpairCmp());
    for(auto& p : counts)
    {
        pointers[p.first] = &p;
    }

    int minRun = 999999999;
    auto minStart = begin(v);
    int lastRun = 1;
    for (auto it = begin(v); it != end(v); ++it)
    {
        if(lastRun == 0)
        {
            break;
        }
        auto startPoint = it;
        int runLen = lastRun - 1;
        if(it != begin(v))
        {
            int n = *(it - 1);
            ++pointers[n]->second;
            int cur = pointers[n] - &counts[0];
            int pos = cur;
            /*while(cur > 0 && !(counts[cur] < counts[cur-1]))
            {
                swap(counts[cur], counts[cur-1]);
                --cur;
            }*/
            bool sw = false;
            while(cur > 0 && !(counts[pos] < counts[cur-1]))
            {
                --cur;
                sw = true;
            }
            if(sw)
            {
                swap(counts[pos], counts[cur]);
            }
        }

        while(counts[0].second > K)
        {
            if(it + runLen == end(v))
            {
                break;
            }
            int n = *(it + runLen);
            --pointers[n]->second;
            int cur = pointers[n] - &counts[0];
            int pos = cur;
            /*
            while(cur < counts.size() - 1 && counts[cur] < counts[cur+1])
            {
                swap(counts[cur], counts[cur+1]);
                ++cur;
            }
            */
            bool sw = false;
            while (cur < counts.size() - 1 && counts[pos] < counts[cur+1])
            {
                ++cur;
                sw = true;
            }
            if (sw)
            {
                swap(counts[pos], counts[cur]);
            }
            ++runLen;
        }

        if(it + runLen == end(v))
        {
            break;
        }
        else
        {
            if(runLen < minRun)
            {
                minRun = runLen;
                minStart = startPoint;
            }
        }
        lastRun = runLen;
    }
    
    std::cout << minRun << std::endl;
    std::cout << "Starting at position: " << std::distance(begin(v), minStart) << std::endl;
    auto diff = std::chrono::high_resolution_clock::now() - startTime;
    std::cout << "Time: " << std::chrono::duration_cast<std::chrono::milliseconds>(diff).count() << std::endl;
    return 0;
}

 

 

1474412270.2748842

Link to comment
Share on other sites

Link to post
Share on other sites

On 3/1/2016 at 8:17 PM, fizzlesticks said:

Ah those damn Russian judges. 

It could be made a bit faster by changing those bubble sorts to more of an insertion sort. Don't know if that would make a big enough difference.

 

edit: ok I was wrong, that increased the speed by a ton.

 

  Hide contents

 

 

I'm sorry to tell you but it's still not fast enough.

 

"damn Russian judges"

i5 4670k @ 4.2GHz (Coolermaster Hyper 212 Evo); ASrock Z87 EXTREME4; 8GB Kingston HyperX Beast DDR3 RAM @ 2133MHz; Asus DirectCU GTX 560; Super Flower Golden King 550 Platinum PSU;1TB Seagate Barracuda;Corsair 200r case. 

Link to comment
Share on other sites

Link to post
Share on other sites

I see why my original idea wouldn't work. I really should have seen it a long time ago -_- 

 

However, I can see how it can be of use, before you go into the main algorithm. What if you iterate through the sequence until you get to the first number that exceeds k, and then iterate backwards from the end of the sequence until we get to the first number that exceeds k from that direction, and remove all the numbers in between them (inclusive)? This procedure can be carried out either none or multiple times, until the index from the reverse direction is smaller than the index from the forward direction. After this procedure is done, then you can continue on with the algorithm you've already implemented.

 

I can't see how this would result in noncontiguous sequences or suboptimal sequences being removed - theoretically any sequences you remove after this initial procedure will always be contiguous with the sequences you remove with your other algorithm. 

 

What do you think? 

Link to comment
Share on other sites

Link to post
Share on other sites

6 hours ago, Nineshadow said:

-snip-

Was able to get 10/10 with this code.

Spoiler

#define _CRT_SECURE_NO_WARNINGS
#include <iostream>
#include <vector>
#include <algorithm>
#include <chrono>
#include <fstream>
#include <sstream>
 
using std::vector;
using std::sort;
using std::begin;
using std::end;
 
struct mpair;
vector<mpair*> pointers;
 
struct mpair
{
    int first;
    int second;
 
    mpair() : first(0), second(0) {}
 
    bool operator<(const mpair& o)
    {
        if (second == o.second)
        {
            return first < o.first;
        }
        else
        {
            return second < o.second;
        }
    }
};
 
void swap(mpair& p1, mpair& p2)
{
    std::swap(pointers[p1.first], pointers[p2.first]);
    std::swap(p1.first, p2.first);
    std::swap(p1.second, p2.second);
}
 
struct mpairCmp
{
    bool operator()(const mpair& p1, const mpair& p2)
    {
        if (p1.second == p2.second)
        {
            return p1.first < p2.first;
        }
        else
        {
            return !(p1.second < p2.second);
        }
    }
};
 
int fast_atoi(const char * str)
{
    int val = 0;
    while (*str) {
        val = val * 10 + (*str++ - '0');
    }
    return val;
}
 
int main()
{
    auto startTime = std::chrono::high_resolution_clock::now();
 
    std::ifstream f("maxk.in");
    std::string s;
    std::getline(f, s);
    std::stringstream ss(s);
 
    int N, K;
    ss >> N;
    ss >> K;
 
    vector<int> v(N);
 
    int c = 0;
    while(c < N)
    {
        std::getline(f, s, ' ');
        v[c++] = fast_atoi(s.c_str());
    }
 
    f.close();
 
    int m = *max_element(begin(v),end(v));
    vector<mpair> counts(m+1);
    pointers.resize(m+1);
 
    for(int i=0; i < m+1; ++i)
    {
        counts[i].first = i;
    }
    for(int i=0; i < v.size(); ++i)
    {
        ++counts[v[i]].second;
    }
 
    sort(begin(counts), end(counts), mpairCmp());
    for(auto& p : counts)
    {
        pointers[p.first] = &p;
    }
 
    int minRun = 999999999;
    auto minStart = begin(v);
    int lastRun = 1;
    for (auto it = begin(v); it != end(v); ++it)
    {
        if(lastRun == 0)
        {
            break;
        }
        auto startPoint = it;
        int runLen = lastRun - 1;
        if(it != begin(v))
        {
            int n = *(it - 1);
            ++pointers[n]->second;
            int cur = pointers[n] - &counts[0];
            int pos = cur;
 
            while(cur > 0 && !(counts[pos] < counts[cur-1]))
            {
                --cur;
            }
            if(pos != cur)
            {
                swap(counts[pos], counts[cur]);
            }
        }
 
        while(counts[0].second > K)
        {
            if(it + runLen == end(v))
            {
                break;
            }
            int n = *(it + runLen);
            --pointers[n]->second;
            int cur = pointers[n] - &counts[0];
            int pos = cur;
 
            while (cur < counts.size() - 1 && counts[pos] < counts[cur+1])
            {
                ++cur;
            }
            if (pos != cur)
            {
                swap(counts[pos], counts[cur]);
            }
            ++runLen;
        }
 
        if(it + runLen == end(v))
        {
            break;
        }
        else
        {
            if(runLen < minRun)
            {
                minRun = runLen;
                minStart = startPoint;
            }
        }
        lastRun = runLen;
    }
     
    FILE* file = fopen("maxk.out", "w");
    fprintf(file, "%d", minRun);
    fclose(file);
     
    std::cout << minRun << std::endl;
    std::cout << "Starting at position: " << std::distance(begin(v), minStart) << std::endl;
    auto diff = std::chrono::high_resolution_clock::now() - startTime;
    std::cout << "Time: " << std::chrono::duration_cast<std::chrono::milliseconds>(diff).count() << std::endl;
    return 0;
}

 

But seeing that it's supposed to be a 1/5 on the difficulty scale I feel like we're missing some easy way of doing this.

 

1474412270.2748842

Link to comment
Share on other sites

Link to post
Share on other sites

50 minutes ago, fizzlesticks said:

Was able to get 10/10 with this code.

  Hide contents

 

But seeing that it's supposed to be a 1/5 on the difficulty scale I feel like we're missing some easy way of doing this.

 

We definitely are.

Also , people are getting 100% with 0.7kb sources, while ours are almost 4kb .

i5 4670k @ 4.2GHz (Coolermaster Hyper 212 Evo); ASrock Z87 EXTREME4; 8GB Kingston HyperX Beast DDR3 RAM @ 2133MHz; Asus DirectCU GTX 560; Super Flower Golden King 550 Platinum PSU;1TB Seagate Barracuda;Corsair 200r case. 

Link to comment
Share on other sites

Link to post
Share on other sites

6 hours ago, Philosobyte said:

I see why my original idea wouldn't work. I really should have seen it a long time ago -_- 

 

However, I can see how it can be of use, before you go into the main algorithm. What if you iterate through the sequence until you get to the first number that exceeds k, and then iterate backwards from the end of the sequence until we get to the first number that exceeds k from that direction, and remove all the numbers in between them (inclusive)? This procedure can be carried out either none or multiple times, until the index from the reverse direction is smaller than the index from the forward direction. After this procedure is done, then you can continue on with the algorithm you've already implemented.

 

I can't see how this would result in noncontiguous sequences or suboptimal sequences being removed - theoretically any sequences you remove after this initial procedure will always be contiguous with the sequences you remove with your other algorithm. 

 

What do you think? 

Yeah, this is actually one way to do it. It gets 100 points.

const int N = 1000000 + 10;
const int VAL_MAX = 100000 + 10;
 
int a[N];
int n, k;
int freq[VAL_MAX];
 
int main()
{
    freopen("maxk.in","r",stdin);
    freopen("maxk.out","w",stdout);
 
    cin >> n >> k;
    for (int i = 1; i <= n;i++) {
        scanf("%d", a + i);
    }
 
    int idL = 0, idR = 0;
    for (int i = 1; i <= n; i++) {
        freq[a[i]]++;
        if (freq[a[i]] > k) {
            idL = i;
            break;
        }
    }
    if (idL == 0) {
        cout << 0;
        return  0;
    }
    freq[a[idL]]--;
    for (int i = n; i >= 1; i--) {
        freq[a[i]]++;
        if (freq[a[i]] > k) {
            idR = i;
            break;
        }
    }
    freq[a[idR]]--;
    int ans = idR - idL + 1;
    int l, r;
    for (l = idL-1, r = idR; l >= 1; l--) {
        freq[a[l]]--;
        while (1) {
            if (freq[a[r]] < k) {
                freq[a[r]]++;
                r--;
            }  else {
                break;
            }
        }
        ans = min(ans, r - l + 1);
    }
    cout << ans;
    return 0;
}

I was expecting this problem to be more complex.

...

i5 4670k @ 4.2GHz (Coolermaster Hyper 212 Evo); ASrock Z87 EXTREME4; 8GB Kingston HyperX Beast DDR3 RAM @ 2133MHz; Asus DirectCU GTX 560; Super Flower Golden King 550 Platinum PSU;1TB Seagate Barracuda;Corsair 200r case. 

Link to comment
Share on other sites

Link to post
Share on other sites

Create an account or sign in to comment

You need to be a member in order to leave a comment

Create an account

Sign up for a new account in our community. It's easy!

Register a new account

Sign in

Already have an account? Sign in here.

Sign In Now

×