
[UPDATE] Python project - I made a Twitter bot that will tell you if your tweets are more like Donald Trump or Hillary Clinton.

56 minutes ago, randomhkkid said:

Yep! I built a k-nearest-neighbours algorithm that performed at around 65-70% accuracy on a PCA/SVD-reduced training and test set. My current SVM implementation has been tuned to roughly 87% accuracy on the same data. I need to go back and run comprehensive tests to compare my different algorithms, but I'm hoping a neural network can get me into the 90s.

How about random forests? And how about ensemble methods?

Personally I liked playing with random forests, gradient boosting classifiers, and SVMs, then averaging their predictions and tuning that.
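
Roughly something like this in scikit-learn (just a sketch with placeholder hyperparameters, and X_train / y_train standing in for whatever features and labels you have):

```python
from sklearn.ensemble import (
    GradientBoostingClassifier,
    RandomForestClassifier,
    VotingClassifier,
)
from sklearn.svm import SVC

# Average the predicted probabilities of the three models ("soft" voting),
# then tune the individual models and the ensemble weights from there.
ensemble = VotingClassifier(
    estimators=[
        ("rf", RandomForestClassifier(n_estimators=200)),
        ("gb", GradientBoostingClassifier()),
        ("svm", SVC(probability=True)),  # probability=True so the SVM can be averaged
    ],
    voting="soft",
)
ensemble.fit(X_train, y_train)  # X_train / y_train: your feature matrix and labels
```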


7 hours ago, laminutederire said:

How about random forests? And how about ensemble methods?

Personally I liked playing with random forests, gradient boosting classifiers, and SVMs, then averaging their predictions and tuning that.

Interesting! I'll look into that; I've only ever heard of random forests for branch prediction in decision trees.



4 hours ago, randomhkkid said:

Interesting! I'll look into that; I've only ever heard of random forests for branch prediction in decision trees.

Well, random forests are basically a lot of decision trees put together. There are subtleties in how you train the individual trees: do you restrict the features each one can see, do you shuffle or resample the data points, and how, etc. Random forests are usually pretty good. Of course it'll depend on the problem, but they can be as powerful as SVMs, sometimes even more so.

I mention those since they're much easier to implement than neural networks. At least in theory, because NNs need to be well designed to converge properly, while random forests are trained much like k-NN or SVMs, at least in scikit-learn.
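
In scikit-learn terms that looks roughly like this (just a sketch; the parameter values are illustrative, not tuned):

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

# Each tree sees a bootstrap resample of the rows and, at every split,
# only a random subset of the features (max_features).
rf = RandomForestClassifier(
    n_estimators=200,     # number of decision trees in the forest
    max_features="sqrt",  # restrict the features each split can look at
    bootstrap=True,       # resample the training points for each tree
    random_state=0,
)

# X_train / y_train: the reduced tweet features and labels (assumed defined).
print(cross_val_score(rf, X_train, y_train, cv=5).mean())
```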

 


  • 2 weeks later...
12 hours ago, laminutederire said:

Hey!

Saw that and thought about your project :)

Something like that with your neural network could be damn fun to be honest :)

 

 

That's a great video, thanks! Building a multi-layer perceptron at the moment, so that could be a cool aside if I get it done :D



57 minutes ago, randomhkkid said:

That's a great video, thanks! Building a multi-layer perceptron at the moment, so that could be a cool aside if I get it done :D

Ahah, you didn't want to do a one-layer perceptron? :P

Does everything converge correctly?


33 minutes ago, laminutederire said:

Ahah, you didn't want to do a one-layer perceptron? :P

Does everything converge correctly?

Yep, I implemented a single-layer logistic regression perceptron using a sigmoid activation function! Getting ~87% accuracy, which is pretty impressive; trying to improve on that with more layers.
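
In NumPy the core of it is roughly this (a simplified sketch, not my exact code; X is the tweet feature matrix and y the 0/1 labels):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_logistic_perceptron(X, y, lr=0.1, epochs=1000):
    """X: (n_samples, n_features), y: 0/1 labels (e.g. Clinton = 0, Trump = 1)."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(epochs):
        p = sigmoid(X @ w + b)           # predicted probabilities
        grad_w = X.T @ (p - y) / len(y)  # gradient of the cross-entropy loss
        grad_b = np.mean(p - y)
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

# Predict with: sigmoid(X_test @ w + b) > 0.5
```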



7 hours ago, randomhkkid said:

Yep, I implemented a single-layer logistic regression perceptron using a sigmoid activation function! Getting ~87% accuracy, which is pretty impressive; trying to improve on that with more layers.

I was just teasing a bit, since every multi-layer perceptron can in principle be approximated by one with a single hidden layer. It's just that finding that representation is non-trivial and a real pain :)

Anything between 85 and 95% should be good enough, I guess. You don't necessarily want to overfit. Even with your SVM results you were pretty much spot on; tweaking it could've yielded a good enough result without overfitting the data too much. That was my biggest mistake during my first Kaggle project, hence my emphasis on that advice!

(Purposely overfitting is a good exercise though.)
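
One simple sanity check for that (a sketch, assuming scikit-learn and placeholder hyperparameters) is to compare the training accuracy with a cross-validated accuracy; a big gap between the two is the classic overfitting signature:

```python
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

clf = SVC(C=1.0, kernel="rbf")  # illustrative hyperparameters, not tuned
clf.fit(X_train, y_train)

train_acc = clf.score(X_train, y_train)
cv_acc = cross_val_score(SVC(C=1.0, kernel="rbf"), X_train, y_train, cv=5).mean()

# If train_acc is much higher than cv_acc, the model is memorising the training set.
print(train_acc, cv_acc)
```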


4 hours ago, laminutederire said:

I was just teasing a bit, since every multi-layer perceptron can in principle be approximated by one with a single hidden layer. It's just that finding that representation is non-trivial and a real pain :)

Anything between 85 and 95% should be good enough, I guess. You don't necessarily want to overfit. Even with your SVM results you were pretty much spot on; tweaking it could've yielded a good enough result without overfitting the data too much. That was my biggest mistake during my first Kaggle project, hence my emphasis on that advice!

(Purposely overfitting is a good exercise though.)

Haha, yeah, it took me an embarrassing amount of time to get my head around all the terminology floating around: softmax, logistic regression, multi-layer perceptron, neural network, epoch, etc. I swear they're not even trying to name things so that they're easy to understand :P

 

Yeah, I'm pretty happy with the results so far; both politicians are talking about the same topics with different slants on them, which makes this a particularly hard problem for a classifier. Trying to prevent overfitting by doing various things like stopping training when the cost falls below a threshold or by dropping random nodes each epoch.
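
The node-dropping idea (dropout) looks roughly like this for a hand-rolled NumPy layer (a sketch, not my actual code; the keep probability is just an example value):

```python
import numpy as np

def dropout_forward(activations, keep_prob=0.8, training=True):
    """Randomly zero out hidden nodes during training (inverted dropout).

    activations: output of a hidden layer, shape (batch, n_units).
    Scaling by 1/keep_prob means nothing needs to change at test time.
    """
    if not training:
        return activations
    mask = (np.random.rand(*activations.shape) < keep_prob) / keep_prob
    return activations * mask
```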



1 hour ago, randomhkkid said:

Haha, yeah, it took me an embarrassing amount of time to get my head around all the terminology floating around: softmax, logistic regression, multi-layer perceptron, neural network, epoch, etc. I swear they're not even trying to name things so that they're easy to understand :P

 

Yeah, I'm pretty happy with the results so far; both politicians are talking about the same topics with different slants on them, which makes this a particularly hard problem for a classifier. Trying to prevent overfitting by doing various things like stopping training when the cost falls below a threshold or by dropping random nodes each epoch.

Ahah, when seen in a proper course it feels more natural, because you see the problem, try to solve it, and end up building softmax or logistic functions yourself :)
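
Both of those are only a few lines in NumPy anyway (a generic sketch):

```python
import numpy as np

def sigmoid(z):
    # squashes a score into a probability for a two-class problem
    return 1.0 / (1.0 + np.exp(-z))

def softmax(z):
    # generalises the logistic function to several classes; rows of z are score vectors
    e = np.exp(z - z.max(axis=-1, keepdims=True))  # subtract the max for numerical stability
    return e / e.sum(axis=-1, keepdims=True)
```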

 

As for the notation, I hear you! I had the same issue with differential geometry...

 

Hmm, yeah, that works well with other methods too: you can also drop data points or features randomly to see if it does better (which is a common method for random forests).
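
In scikit-learn, BaggingClassifier does exactly that kind of random dropping (a sketch; the fractions are only illustrative):

```python
from sklearn.ensemble import BaggingClassifier

# Each member model (a decision tree by default) sees only a random subset of the
# rows (max_samples) and of the columns (max_features) - the same trick random forests use.
bag = BaggingClassifier(
    n_estimators=100,
    max_samples=0.8,   # drop ~20% of the data points per model
    max_features=0.7,  # drop ~30% of the features per model
    random_state=0,
)
bag.fit(X_train, y_train)  # X_train / y_train assumed defined
```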

I heard about boosting methods for random forests which could be implemented with neural networks; have you seen that?

 

Edit: I read through the first article in a Google search (which is also the first on Google Scholar) and it seemed appropriate enough to share with you, to illustrate what I asked right above (here is the Link to the pdf).

 

Oh, and have you tried convolutional layers and pooling? They work well with images, since the same features can show up after a translation. It could make sense in your case, since words can be placed at different positions in the sentence. (It depends on your representation of the data though; does the order count in yours?)
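
If order does count, a 1-D convolution over the word sequence is the usual translation-invariant trick; roughly something like this in Keras (just a sketch, and every size here - vocabulary, embedding width, filters - is a placeholder):

```python
from tensorflow.keras import layers, models

# Assumes each tweet has been turned into a sequence of integer word indices.
model = models.Sequential([
    layers.Embedding(input_dim=10000, output_dim=64),              # placeholder vocabulary size
    layers.Conv1D(filters=128, kernel_size=3, activation="relu"),  # same filter slides over every position
    layers.GlobalMaxPooling1D(),                                   # keep the strongest response, wherever it occurs
    layers.Dense(64, activation="relu"),
    layers.Dense(1, activation="sigmoid"),                         # Trump vs. Clinton probability
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
```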
