Analyzing Backpack and Screwdriver Data for Fun

Safe-Ad9468

Introduction

For the past few months, I have been working on a small project to learn how to code and to play with different pieces of software and technologies.

Using Python to scrape the web several times an hour, first running the code locally and later in a Docker container hosted on a NAS running Unraid, I was able to collect data on the LTT backpack and LTT screwdriver. Namely, on the website you would see:

 


 

There are #### screwdrivers remaining in wave X.

or

There are #### backpacks remaining in wave X.
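
A minimal sketch of how a banner like this could be parsed, assuming the sentence appears verbatim in the page HTML. The regular expression, the function names, and the User-Agent string are illustrative assumptions, not the code actually used for this project, and the URL is left to the caller.

```python
import re
import urllib.request

# Assumed pattern for the banner text quoted above; the real markup may differ.
REMAINING_RE = re.compile(
    r"There are ([\d,]+) (backpacks|screwdrivers) remaining in wave (\w+)"
)


def parse_remaining(text):
    """Return (quantity, product, wave) from the banner text, or None if absent."""
    match = REMAINING_RE.search(text)
    if match is None:
        return None
    quantity = int(match.group(1).replace(",", ""))
    return quantity, match.group(2), match.group(3)


def fetch_remaining(url, timeout=10):
    """Download a page and parse its banner. The URL is supplied by the caller."""
    request = urllib.request.Request(
        url, headers={"User-Agent": "stock-logger-sketch"}  # hypothetical name
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        html = response.read().decode("utf-8", errors="replace")
    return parse_remaining(html)


print(parse_remaining("There are 12,345 backpacks remaining in wave 2."))
# (12345, 'backpacks', '2')
```

Calling fetch_remaining on a schedule (for example from cron inside the container) and appending each result with a timestamp to a CSV file would give the kind of time series analyzed below.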

However, on October 13, 2022, the LTT store updated its website and from a quick glance, it seems that the screwdriver data has been consolidated into a single number. Therefore, I thought that now would be a good time to share what I have gathered.

 

Data

I collected data on the backpack from 2022-08-19 to 2022-10-13 (55 days)

and

I collected data on the screwdriver from 2022-08-30 to 2022-10-13 (44 days).
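
The day counts above can be checked with plain date subtraction. This is only a quick sanity check using the dates stated in this post.

```python
from datetime import date

end = date(2022, 10, 13)

# Number of days between the first and last collection dates.
backpack_days = (end - date(2022, 8, 19)).days
screwdriver_days = (end - date(2022, 8, 30)).days

print(backpack_days, screwdriver_days)  # 55 44
```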

 

I lost about 10 hours of screwdriver data on October 30 and about 1 to 2 days’ worth of data on October 8 and October 9.

In terms of data collection frequency, I initially collected backpack data about every 15 minutes, then switched to every 5 minutes on September 10. For the screwdriver, I collected data every 5 minutes from the start.

One last note: I am making the assumption that if it said “There are 12345 backpacks remaining…” and then 5 minutes later it said “There are 12340 backpacks remaining…”, then 5 backpacks were ordered in those 5 minutes. I’ll be referring to these as “orders” moving forward.
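
The assumption above amounts to differencing consecutive readings. A minimal sketch, with a made-up list of readings and a hypothetical helper name:

```python
def orders_between(readings):
    """Turn consecutive 'remaining' readings into per-interval order counts.

    Positive values mean the quantity dropped (units ordered);
    negative values mean the quantity went up.
    """
    return [prev - curr for prev, curr in zip(readings, readings[1:])]


readings = [12345, 12340, 12340, 12341]  # made-up sample values
print(orders_between(readings))  # [5, 0, -1]
```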

 

Overall Statistics (Backpack)

Let's look at backpack orders over time (note the data anomalies mentioned above):

 

image.png.8c74f083776548ccc43f3cafd878acae.png

 

Over the course of 55 days, the quantity of backpacks went from 19,300 to 15,188, for a total change of 4,112, or roughly 75 backpacks per day. However, averages tend to hide information: the figure below shows that backpack sales tapered off somewhere in late September.

 

image.png.0300cc8568747b3e40ad7a11e550f125.png

 

We can also see that a spike occurred towards the end of August.

If we look at sales by day of the week, it's pretty clear that most of the sales occurred on Tuesdays and Wednesdays. However, these data are skewed by the high sales on August 30 and August 31, a Tuesday and a Wednesday respectively.

 

image.png.44e61b0883778796294e8064c05a98d4.png
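
A day-of-week breakdown like the one above can be sketched by summing per-interval orders under each sample's weekday. The sample values below are made up, and the function name is hypothetical.

```python
from collections import defaultdict
from datetime import datetime

# Explicit names avoid locale-dependent formatting; weekday() returns 0 for Monday.
DAYS = ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday", "Sunday"]


def orders_by_weekday(samples):
    """samples: iterable of (timestamp, orders) pairs. Returns {weekday: total}."""
    totals = defaultdict(int)
    for timestamp, orders in samples:
        totals[DAYS[timestamp.weekday()]] += orders
    return dict(totals)


samples = [  # made-up values for illustration
    (datetime(2022, 8, 30, 12, 0), 40),  # a Tuesday
    (datetime(2022, 8, 31, 12, 0), 35),  # a Wednesday
    (datetime(2022, 9, 3, 12, 0), 5),    # a Saturday
    (datetime(2022, 9, 6, 12, 0), 4),    # a Tuesday
]
print(orders_by_weekday(samples))
# {'Tuesday': 44, 'Wednesday': 35, 'Saturday': 5}
```

Because a couple of unusually busy dates can dominate a weekday total, comparing the result with and without those dates is a simple way to see how much they skew the picture.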

 

We can break down the data by the change in quantity within each 5 minute window. Treating each window as a single order is a strong assumption, since it is quite possible that 2 or more orders were placed in the same 5 minutes. Still, if we assume at most one order per window (and therefore use only the data collected every 5 minutes, September 10 to October 13), we see the following distribution:

Order Qty       Count
Cancels: -6         1
Cancels: -5         1
Cancels: -4         2
Cancels: -3        14
Cancels: -2        41
Cancels: -1       184
No Orders       6,063
Order: 1        1,031
Order: 2          112
Order: 3           13
Order: 4            1

 

Obviously, most of the time there is no change, and the majority of the orders are for a quantity of 1. We also see some negative quantities, where the number remaining went up. These “could” be order cancellations. We can check whether there is a pattern to these possible cancellations:

 

image.png.cefeafc97fcf476b71a428cfb0b0fd85.png

 

However, it seems that there isn’t really a pattern.
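
A distribution table like the one above can be built by counting the per-window changes. This is a sketch with made-up readings; the labels mirror the table, and the function names are hypothetical.

```python
from collections import Counter


def label(change):
    """Map a per-window change to the labels used in the table."""
    if change < 0:
        return f"Cancels: {change}"
    if change == 0:
        return "No Orders"
    return f"Order: {change}"


def change_distribution(readings):
    """Count how often each change occurs between consecutive readings."""
    changes = [prev - curr for prev, curr in zip(readings, readings[1:])]
    return Counter(label(change) for change in changes)


readings = [100, 99, 99, 97, 98, 98]  # made-up sample values
print(change_distribution(readings))
```

Here a drop from 99 to 97 is counted as one order of 2, which is exactly the one-order-per-window assumption described above.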

 

Overall Statistics (Screwdriver)

One important note about the screwdriver data: the numbers I scraped aren’t what was displayed on the website but instead come from the page source code. However, I noticed that when the number on the website decreased, so did one of the four values in the source code (more on this below).

 

Let's look at raw screwdriver data over time (note the data anomalies mentioned above):

 

image.png.077d2ba3e6d689642dca124320798bc8.png

 

Over the course of 44 days, we see the quantity of screwdrivers went from 65,600 to 49,991, for a total change of 15,609.

 

There are a few interesting things to note:

 

First, while the starting point of my data was 65,600, there was actually a "jump" in quantity on September 2, where it jumped to 80,021; this should be clear in the graph above. By summing the changes in reverse, starting from the final quantity, we can adjust for the jump, which is shown in the graph below:

 

image.png.6c0033786ab76ae6538d2f3ea42520f0.png

 

One guess is that the number of screwdrivers to be manufactured was increased and the quantity in the source code was adjusted upwards at that point. However, that is just a far-out guess.

 

We can now see that the total number of screwdrivers ordered over 44 days is 65,555 and not 15,609.
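
One way to read "summing in reverse" is sketched below: walk backward from the final reading, add back each interval's orders, and treat any large upward jump as zero orders. The threshold, the function name, and the sample readings are assumptions for illustration, not necessarily how the adjusted graph was produced.

```python
def adjust_for_jumps(readings, jump_threshold=1000):
    """Rebuild the series as if large upward jumps had never happened.

    Works backward from the last reading. An interval where the quantity
    rose by at least jump_threshold is treated as a stock adjustment and
    contributes zero orders; smaller rises are kept as possible cancellations.
    """
    adjusted = [readings[-1]]
    pairs = zip(reversed(readings[:-1]), reversed(readings[1:]))
    for prev, curr in pairs:
        orders = prev - curr
        if orders <= -jump_threshold:
            orders = 0  # ignore the jump itself
        adjusted.append(adjusted[-1] + orders)
    adjusted.reverse()
    return adjusted


readings = [100, 90, 60, 150, 140, 120]  # made-up values with a jump of 90
adjusted = adjust_for_jumps(readings, jump_threshold=50)
print(adjusted)                      # [190, 180, 150, 150, 140, 120]
print(adjusted[0] - adjusted[-1])    # 70
```

In the toy data the naive start-minus-end difference would be -20, while the adjusted series gives 70 total orders, which is the same kind of correction as going from 15,609 to 65,555.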

 

Secondly, there seems to be a large drop in quantity around August 31. I’m not sure of the reason for that dip. I’m guessing that it is related to the drop in backpack quantity around the same time.

 

Thirdly, what isn't shown in the graph above is another change in quantity. While there was only one backpack, there are 4 different variants of the screwdriver. Sometime on September 21, there was a change in the underlying data where the quantities shifted between variants but the overall total remained the same. This is clear in the graph below:

 

image.png.3db15dc6bb0e1a74029b36b5af419022.png

 

You can see that the quantities for “type_2” and “type_4” drastically decreased on September 21. My guess would be that this reflects a planned decrease in production of the screwdriver with the black shaft, as I believe LMG said that they won’t be producing a lot of those. However, again, this is just a guess.

 

Below, I look at overall orders by day.

 

image.png.42c5c61abda7baafd8c3d3eb8e06e01e.png

 

We see that Wednesday completely surpasses the other days. When looking at the raw data, there are quite a few orders that occurred on August 31, which we can see in the quantity drops in the first 2 graphs. The graph below omits the data prior to September 2 due to the jump and rapid decrease in quantity, so that we can perhaps see a trend (maybe?):

 

image.png.a69af1badaa585e30b6bc0e3a2c95d02.png

 

We see that, unlike the backpack, most screwdriver orders seem to have occurred on the weekends.

 

We can also break down the orders by day and by type:

 

image.png.798ba2b2dca501b2c4235bec853e8707.png

 

It’s clear that “type_3” was ordered the most by far, with “type_4” and “type_1” nearly tied for second. It’s hard to tell from the graph above but the table below makes it clearer:

 

Day            type_1   type_2   type_3   type_4
Sunday          1,306      989    2,273    1,300
Monday            909      646    1,756      913
Tuesday           836      582    1,616      816
Wednesday         751      544    1,502      861
Thursday          667      413    1,499      677
Friday            636      429    1,194      665
Saturday        1,390      986    2,412    1,451
Grand Total     6,495    4,589   12,252    6,683
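
As a quick check, the grand totals can be re-derived from the daily figures in the table. The dictionary below simply restates those figures; the variable names are arbitrary.

```python
# Daily orders per variant, in the order (type_1, type_2, type_3, type_4).
orders = {
    "Sunday":    (1306, 989, 2273, 1300),
    "Monday":    (909, 646, 1756, 913),
    "Tuesday":   (836, 582, 1616, 816),
    "Wednesday": (751, 544, 1502, 861),
    "Thursday":  (667, 413, 1499, 677),
    "Friday":    (636, 429, 1194, 665),
    "Saturday":  (1390, 986, 2412, 1451),
}

# zip(*rows) transposes the rows into one column per variant.
totals = [sum(column) for column in zip(*orders.values())]
print(totals)       # [6495, 4589, 12252, 6683]
print(sum(totals))  # 30019
```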

 

Following what was done with the backpack, we can look at the most common quantity of screwdrivers ordered:

 

image.png.0c3b8eaeb9ff4b025f9f4d6e5ab8d35a.png

 

Interestingly, we see quite a few windows where multiple screwdrivers were ordered. Again, this relies on the strong assumption of one order per 5 minutes, so these numbers are an upper bound on the number of screwdrivers per order.

 

And below is a similar graph broken up by type:

 

image.png.859a3a3674e9466373f5fc0a215cb5bc.png

 

Finally, looking at cancellations, there does not seem to be a clear pattern:

 

image.png.f3032147d90e0452dc6749622797119c.png

 

Conclusion

This was a little project that I have been working on in my spare time to familiarize myself with some of the different software and technologies that I had access to. Instead of keeping all the results/data to myself, I thought it would be a fun exercise to share.

All errors are my own. Thanks for reading.

19 hours ago, that_dude said:

Fundamentals of data harvesting:

1. Ask.

2. Reduce your impact.

 

In your particular case you have monitored two pages every 5 minutes for 55/44 days.

(55+44)*24*60/5 = 28,512 requests!!!

That number of requests is nothing for a busy site. Also, the internet is kind of free to use, so I really don't see a reason to ask whether I can use my web crawler on your website.

On 10/18/2022 at 12:32 AM, that_dude said:

Have people accidentally taken down pages? Yes. Sometimes it is just a shitty server being overwhelmed. Sometimes it is a programming error on your side causing unusually high loads.

At that point your service has a fault. (D)DoS protection is something the service provider has to account for.
