Search the Community
Showing results for tags 'big data'.
-
Budget (including currency): 2000 - 2300 $ (CAD) Country: Canada Games, programs or workloads that it will be used for: Data processing, multiple VMs, gaming (later) Other details (existing parts lists, whether any peripherals are needed, what you're upgrading from, when you're going to buy, what resolution and refresh rate you want to play at, etc): Initial build: https://ca.pcpartpicker.com/list/9VhdTJ OK guys or girls, I want to build my first PC for my Big Data courses. It needs to be able to run multiple Linux VMs for data processing, and I want this PC to last a long time. So I decided to go with an i9-10900K with 32GB of 3600MHz RAM and no graphics card because of the shortage (that's why I chose Intel). Case: My initial plan was to build a Mini ITX... I live in a small condo and I wanted something that won't take up the whole space. But because of temperature management, I think that wouldn't have been the greatest idea. I chose the H700 case for that, but I also saw the H510 case. So if you have any suggestions for a smaller case, I would really appreciate it! GPU / Power Supply (Futureproof): When the graphics card shortage ends (if it ends...), I want to add a solid GPU, like a 3080 or better, for gaming. I'm not sure about the wattage for the power supply, but with a heavy workload and maybe a future GPU, I want something I would have no worries about. RGB: OK, I know this is not necessary... but when you spend a considerable amount of money on a PC, why not make it a showroom? I really don't want a PC that looks like a car with neon. Maybe just something to make it pop a little. I chose RGB on the RAM; if you have suggestions that don't overdo it, I'm listening. Thank you in advance for all your help, Cheers!
-
Problem definition
Given TWO text files, for each word common to both files, find the smaller of the number of times it appears in each file. Output the top 20 common words with the highest such frequency (for words with the same frequency, there is no special requirement for the output order).
Example: if the word “John” appears 5 times in the 1st file and 3 times in the 2nd file, the smaller number of times is 3.
Requirements
- Split the input text with “(space)\t\n\r\f”. Any other tokens like “,.:`” will be regarded as part of the words.
- Remove stop-words as given in Stopwords.txt, such as “a”, “the”, “that”, “of”, … (case sensitive).
- Sort the common words in descending order of the smaller number of occurrences in the two files.
- In general, words with different case or different non-whitespace punctuation are considered different words.
Archive.zip
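For anyone who wants to check their output, the requirements above can be sketched on a single machine in plain Python. This is not a Hadoop/MapReduce job, only an in-memory illustration of the expected result; the function names `count_words` and `top_common_words` are made up for this sketch, and the stop-word set is assumed to have been loaded from Stopwords.txt already.

```python
import re
from collections import Counter

# Delimiters named in the problem statement: space, tab, newline,
# carriage return and form feed. Punctuation stays attached to words.
_DELIMS = re.compile(r"[ \t\n\r\f]+")


def count_words(text, stopwords):
    """Count tokens in `text`, skipping stop-words (case-sensitive match)."""
    tokens = (t for t in _DELIMS.split(text) if t)
    return Counter(t for t in tokens if t not in stopwords)


def top_common_words(text_a, text_b, stopwords, k=20):
    """Return up to `k` (word, smaller_count) pairs, highest count first."""
    counts_a = count_words(text_a, stopwords)
    counts_b = count_words(text_b, stopwords)
    # Counter intersection keeps only words present in both inputs,
    # each with the minimum of its two counts.
    common = counts_a & counts_b
    return common.most_common(k)
```

Because matching is case-sensitive and punctuation is part of the token, “home,” and “home” (or “The” and “the”) are treated as different words, as the requirements state.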
- 1 reply
-
- map reduce
- hadoop
-
(and 2 more)
Tagged with:
-
In a not so remote past, we thought they were seeing our e-mails. Then, they could know our searches on the net. Now, "they are the cloud": they host our personal and family photos, our favorite channel selections, our chat conversations, and our text documents. This so-called innocent cloud can make our lives easier by saving our passwords and taking care of our calendar. Today, they are also ISPs, able to trace just about anything we do online. Even if you're not their customer, you may use their browser, which is awesome, but I can't say the same about its privacy. They sell machines we put in our homes to ask anything, while the device hears every word we say. They can check our phone's GPS. They have the largest map and satellite image service, seeing every single place on the globe, many times with a camera passing by car in front of our homes. They are mobile networks, so it's possible to hear our conversations. They run the keyboard of our phones to know what we type. They sell routers to make home wireless easier and to collect data. They own data centers that host a large amount of information from big companies. They have autonomous cars. They sell domains and provide reCAPTCHA. They translate anything to anything and, most important, they know what we translate. We can pay for things with their services... They.............
-
Over the past few weeks, WikiLeaks, known for releasing information and communications of some of the most powerful government players, has been releasing daily dumps from the compromised email account of Hillary Clinton's campaign chair, John Podesta. In a new email release (ID 37262), it has been revealed that Google's Eric Schmidt detailed a rough draft back in 2014 to use the cloud (big data) to create a system in which to aggregate voter information. Excerpt: Depending on how much data Google intends to aggregate, this may not be an issue. That being said, with the number of data breaches increasing in recent months, this really gives me pause. I have mixed feelings about this. As a geek, I love the idea of things being more connected and the ease of access to information. How can I be assured that the data will be restricted in certain ways? How many different eyes will see the information? I've donated to candidates this election, so will third parties be able to hand this information off to Google at their leisure? Given that a number of states use electronic voting machines, will Google somehow have access to this information? What are your thoughts?
-
ITX Case help New Gaming/School( ill try) Rig
Mtlstreetboy posted a topic in New Builds and Planning
Hi! These are the parts I will be putting in MyComputer. I would really like to build an ITX computer and I was looking into the Fractal Design Nano S with window (+5 to ice skills, you know ). I heard there might be some problem with the GPU being 2.5 slots. So far I have bought the Samsung M.2 and the Strix GTX 1080 Ti. I did not mention anything about the price since this is the money I am looking to spend on it, not more, maybe less! Thank you for your time! (If you guys go too much into nerd spec language I will certainly not understand anything unless it is detailed ) -
China made an analog of the USA's FICO service (credit scoring), but for social life. It takes data from municipal, judicial, and commercial sources, such as Alipay (Chinese PayPal, used everywhere by 448 million people with $23 billion in money transferred) and WeChat (Chinese Facebook, 938 million monthly active users), to build a profile/ranking of a user's life (where you go, how much you spend, what your interests are). It is already being tested in 30 cities and is planned to roll out nationwide in 2020. The largest test at the moment is Zhucheng city, with 160,000 parameters coming from all over the city about every one of its 670,000 citizens. In Zhucheng every citizen was given 1000 starter points. No document clearly states what you should and shouldn't do. It is only known that if your rating is more than 1,050 points, you're a model citizen and marked with AAA. With a thousand points, you can count on AA. With 900, B. If your rating falls below 849, you are already a suspicious carrier of a C rating and will be expelled from work in state and municipal structures. And those who have 599 points and below are going to have a bad time being alive. Their record is on the blacklist with the note D; they become the outcasts of society: they are not hired for almost any kind of work (even in a taxi), and can't get loans, buy tickets for high-speed trains and airplanes, or rent a car or bicycle without a deposit. Neighbors avoid you like fire, because if someone sees you communicating with a D-rated person, your ranking also drops rapidly. Sources: https://en.wikipedia.org/wiki/Social_Credit_System http://carnegie.ru/commentary/71546 I translated a little from the article above to give some sort of picture. Hello Orwell 1984? Or is such a ranking a must-have in our modern society? We are moving toward electronic identification and government anyway. There will be privileges and restrictions based on our life history anyway.
And China is making a centralized rating out of it today, which sounds logical to me (if it works, which is hard to achieve in code and in legal terms). I wonder if, based on machine learning with enough data, you could predict a person's whole life or at least parts of it.
- 9 replies
-
- china
- big brother
-
(and 1 more)
Tagged with:
-
Sources: New York Times, Author's Note (via Google Docs), OSF (pre-print) wang_kosinski.pdf Comments could get nasty here. I need your moderating powers @iamdarkyoshi for anyone whose comments go against the CS (political/religious rants) or derail the topic. It closely resembles a topic I posted before about an AI predicting the outcome of a relationship. I won't lie that it got me concerned for a bit, but reading the actual paper and the FAQ (Google Docs) the authors provided made me think it's indeed a way to warn people about the repercussions of artificial intelligence and machine learning. It reminds me of when Elon Musk said that Mark Zuckerberg doesn't know the dangers of AI if left unregulated. I'm with Elon Musk on this one. On the lighter side of things, this technology can be used by dating apps targeted at gay individuals, but then, I'd rather see the person face to face than let an AI do it for me. Looking at the comments of other experts, it's obvious that the algorithm is very limited by its sample size and subjects, since they only used gay white men and women. The thesis authors said in their Google document that: Just imagine you're a tourist in a country and a facial scanner at the airport identifies you as gay or lesbian; they can immediately put you in a database. Just imagine oppressive countries with staunch religious doctrines using this to identify closeted gay people and have them thrown from the tops of buildings or shamed by society into committing suicide. Or better yet, just imagine North Korea developing an AI that determines if someone has doubts about the regime or is planning to defect to South Korea. Knowledge is power, and just like a knife that can be used to stab someone or cut that delicious medium rare steak, this algorithm is also a double-edged sword. At the moment, it's not ready for prime time as it has limited sample diversity and the authors admit that.
In fact, as stated in their Google Doc, they want to be wrong and they are terrified of the results. More refinement is definitely needed and, just like any scientific theory, it needs to be replicated. At the moment, what humanity knows about the potential and risks of AI and machine learning is just the tip of the iceberg. So there's no need to be concerned or alarmed just yet, but we also shouldn't dismiss a scientific finding just because we disagree with it or because it could pose harm to us. In fact, I would like this paper to be published soon in the Journal of Personality and Social Psychology. I'm not a Psychology major, but I encourage people with that degree to replicate this thesis with much more diversified subjects. Also, it should be kept in mind that "peer-review" is not always what it's cracked up to be as a method of validating or dismissing a theory. Peer-review is important in academia, but most of the time it's just validating research methodology, checking grammar and looking for suspicious and interpolated/forged results. Peer-review doesn't determine if a thesis' conclusion is right or wrong. In fact, there are a lot of bogus theses that are "peer-reviewed". The only way to determine if a thesis' conclusion is right or wrong is to replicate it and check whether the results match up. (See NCBI, Nature, CBC, WSJ) Again, I highly encourage everyone reading this thread to read this Google Doc as it clarifies a lot of concerns on privacy, human rights, etc. https://docs.google.com/document/d/11oGZ1Ke3wK9E3BtOFfGfUQuuaSMR8AO2WfWH3aVke6U/edit?usp=sharing
- 64 replies
-
- neural networks
- artificial intelligence
- (and 3 more)
-
I'm building a server for my company where I'd train a model and later run it at full scale (so it should have options for storage and scaling). What hardware should I use for it? I prefer AMD over Intel, and I'll need multiple GeForce RTX 2080 Ti cards (I think, I'm not sure about this).
- 1 reply
-
- big data
- neural network
-
(and 2 more)
Tagged with:
-
Struggling with Excel 2016! Please HELP ASAP
Sparky_Fr posted a topic in Programs, Apps and Websites
Hi All, I'm sure this sounds like a newbie question but here I go: I have a list of 4000+ individual attendees from 100+ companies. I'd like to create a chart that adds up all my attendees by company (ex: Company 1 = 43 / Company 2 = 15 / Company 73 = 21). When I try creating a chart the "normal way" it adds up all the company names one after another and leaves it all in one block of incomprehensible nonsense. Any help is greatly appreciated! Thanks, Adam -
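The tally being asked for (one count per company, which can then be charted) can also be sketched outside Excel. The snippet below is only an illustration in Python, assuming the company column has been exported as a plain list of names with one entry per attendee; the function name `attendees_per_company` is made up for this sketch.

```python
from collections import Counter


def attendees_per_company(companies):
    """Count how many attendees are listed under each company name."""
    # Strip stray whitespace so "Company 1 " and "Company 1" are counted
    # together, and skip blank cells.
    cleaned = (name.strip() for name in companies)
    return Counter(name for name in cleaned if name)
```

The result maps each company name to its attendee count, which is the one-row-per-company summary a chart needs instead of 4000+ raw rows.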
Not the 'Minority Report' kind of precrime that you were expecting. Basically a variant of risk prediction technology. Nevertheless this is an interesting story. The artificial intelligence (most probably similar to the heuristic analysis that AVs use) was developed to scan employees' emails for patterns indicating whether an employee is potentially going to give away or sell classified business details of the company. The artificial intelligence system developed by UBIC, dubbed Virtual Data Scientist, can sift through messages and identify senders whose writing suggests they are in financial straits or disgruntled about how their employer treats them. The system is built on established human expertise. The Virtual Data Scientist trains itself by studying and emulating the techniques of professional auditors. It can then bring those techniques to bear by scanning massive volumes of email. UBIC says it’s more efficient than traditional manual keyword searches and that even subtle indications of fraud can be detected. The approach links machine learning with analysis of big data and behavioral sciences such as psychology and criminology. The emerging field is known as "behavior informatics" and it has its own IEEE task force and other research groups. UBIC’s system currently works in Japanese only, but support for English and other languages is being added. http://www.pcworld.com/article/2691132/artificial-intelligence-system-can-predict-data-theft-by-scanning-email.html Post your thoughts & rants down below: is it a fair system to snoop on employee email regardless of the risk it carries?
-
It looks like the UK is at it once again. FACT, the Federation Against Copyright Theft, has teamed up with computing giant IBM to utilize their "Big Data" intelligence analytics software to gain access to data that would not normally be available. The new team seeks to shut down services that offer illegal content, such as the now-nonexistent SurfTheChannel. However, it doesn't seem to be targeting those who are downloading such content, only those who are uploading it. The UK has become relentless with its Internet policing and it is time to stop before something significant happens that disrupts the country. We are becoming a choked nation which is likely to revolt within the next few years if this kind of thing continues or develops. There will be ways around the measures the government sets in place; however, they will become much more difficult to find. Original Article: http://torrentfreak.com/fact-teams-up-with-ibm-to-track-down-movie-and-tv-show-pirates-130910/