
AI is the buzzword du jour, and for good reason: it's enabling a LOT of new and useful tools for everyday life. But just what IS it? And why do we think it's being portrayed dishonestly?

Emily @ LINUS MEDIA GROUP                                  

congratulations on breaking absolutely zero stereotypes - @cs_deathmatch


All one can hope is that Tesla paying all buyers the $30k per year they missed out on from robotaxis will calm the hype a bit.


This was a really well-researched video; I'm glad someone with the reach of Linus is demystifying the AI buzzwords.

 

Estimates of the compute of the human brain are all over the place. It depends on whether you count the compute needed to do a molecular simulation, or the compute from the fundamental events per second of a neuron, which is surprisingly low. The chart is also quite old. It doesn't help that specialized GPU accelerators are a lot faster, and specialized NPUs are even faster than that, and I have no idea how relevant they are to the compute of the human brain.

[attached chart: estimates of the compute of the human brain]
In terms of fundamental operations per second (as in events per neuron per second multiplied by the number of neurons), a modern processor might already have enough compute to run a brain-like algorithm, if only we had any idea how to wire a brain-like algorithm. How much hardware is theoretically needed to run a human-level AGI is an open question.
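As a rough back-of-envelope for that "events per neuron per second times number of neurons" figure (a sketch; all numbers below are ballpark assumptions, not measurements):

```python
# Back-of-envelope estimate of the brain's "fundamental events per second".
# All figures are rough, commonly quoted ballpark numbers, not measurements.
NEURONS = 86e9                      # ~86 billion neurons
RATE_LOW, RATE_HIGH = 0.1, 100.0    # Hz: sparse average firing vs. a busy neuron

low, high = NEURONS * RATE_LOW, NEURONS * RATE_HIGH
print(f"spike events/s: ~{low:.1e} to ~{high:.1e}")   # roughly 1e10 to 1e13

# Counting per-synapse operations (~1e4 synapses per neuron) instead of spikes
# pushes the estimate up by several orders of magnitude, which is one reason
# published charts disagree so wildly.
SYNAPSES_PER_NEURON = 1e4
print(f"synaptic events/s: up to ~{NEURONS * SYNAPSES_PER_NEURON * RATE_HIGH:.1e}")
```

For comparison, a single modern accelerator sits somewhere in the 10^14-10^15 operations-per-second range, which is why the lower estimates make "enough raw compute already exists" at least plausible.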

 

Our brain does general intelligence with 20 W. The first AGI is likely to use a building's worth of accelerators drawing tens of megawatts of power to run.

Just like an unoptimized game can gobble up all your CPU and run like garbage, and optimization can make it run better, I personally do expect a human-level AGI to eventually run on laptop-class hardware.

 

 


That's a great point. Another nail in the coffin for Teslas is that they have no sensors other than cameras. No lidar, no parking sensors, not even a rain sensor for the wipers. And yes, the wipers don't always detect rain. There is zero chance Tesla is going to get a Level 5 certification without significant hardware changes to the car, even if they do solve their Autopilot.

 


Another banger video, all about how nonsensical some of the marketing is.
Calling a mouse an AI MOUSE is a good example - the "feature" is that you click the mouse to open AI apps. Duh 😕

Maximums - Asus Z97-K w/ i5 4690, BCLK @ 106.9 MHz × 39 = 4.17 GHz, 8 GB of 2600 MHz DDR3, Gigabyte GTX 970 G1 Gaming @ 1550 MHz

 


Something that's always an interesting experiment with OpenAI and services like it is asking it to explain the plots of TV episodes. For something like a Simpsons episode (maybe the most extensively documented TV show on the entire internet), it'll usually give a pretty accurate plot summary with maybe some minor details wrong. Something popular but with a somewhat less extensive fandom, like Frasier, will have its plots described in broad strokes but with major plot points screwed up. And if you give it a really obscure show nobody gives a shit about (like "Minoriteam"), it'll probably just make something up.


THANK YOU for making this video!

Maybe I'll be less cranky about AI issues in your forums.

(eh, probably not... but really, THANK YOU!)


2 hours ago, Error 52 said:

Something that's always an interesting experiment with OpenAI and services like it is asking it to explain the plots of TV episodes.

Another use is tracking down a song played in a TV show or movie that they don't list correctly, due to some strange legal reason.

MSI X399 SLI Plus | AMD Threadripper 2990WX all-core 3 GHz lock | Thermaltake Floe Riing 360 | EVGA 2080, Zotac 2080 | G.Skill Ripjaws 128 GB 3000 MHz | Corsair RM1200i | 150 TB | Asus TUF Gaming mid tower | 10 Gb NIC


4 hours ago, GoStormPlays said:

Favorite LTT Thumbnail. 

Does the thumbnail imply that the beard was AI-generated all along?

Please use the mark as solution feature if your query was resolved.
If multiple people contributed to a solution, pick the most helpful contribution.

 

https://pcpartpicker.com/list/TM4Hyg


So glad y'all made this video. "AI" today is shrouded in a lot of mysticism, and the companies making these tools would sure love to keep it that way. A lot of non-technical people I know are confused about AI, and it is often difficult to dispel the magic air around it.


I love that Linus created this video.

ChatGPT, DALL-E, and other AIs that can process a huge variety of inputs are considered deep learning AI (or AI that uses deep learning algorithms), which essentially means their networks have more layers (more than 2).
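To make "more layers" concrete, here's a toy sketch (layer sizes and weights are arbitrary; real models like ChatGPT have billions of parameters, but the structural idea is the same stack of layers with nonlinearities in between):

```python
import numpy as np

def relu(x):
    return np.maximum(0, x)

# A toy "deep" network: 4 hidden layers instead of the 1-2 of a shallow net.
rng = np.random.default_rng(0)
layer_sizes = [16, 64, 64, 64, 64, 10]
weights = [rng.normal(size=(m, n)) for m, n in zip(layer_sizes[:-1], layer_sizes[1:])]

def forward(x):
    for W in weights[:-1]:
        x = relu(x @ W)      # hidden layers, each followed by a nonlinearity
    return x @ weights[-1]   # final output layer

print(forward(rng.normal(size=16)).shape)  # (10,)
```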

Most deep learning AIs are further divided into categories such as:

- Natural Language Processing (NLP)

- Image Processing

- Image Generation

- Context Detection, etc.

 

each with its own example algorithms.

I think it's important to know this, since it helps the average consumer manage their expectations, at least for now, with the hope that artificial general intelligence can be created in the near future, however far off it is.

Artificial general intelligence (AGI) can be really challenging to tackle though, since it requires a lot of computing power, and by a lot, I mean a LOT! ChatGPT itself required a huge amount of data before it finally became useful for us, and it still gives us wrong answers sometimes.

 

But all in all, here's hoping AI can help us a lot more... and hopefully not destroy humanity. Great video! 😄


5 hours ago, LoweGule said:

Artificial general intelligence (AGI) can be really challenging to tackle though, since it requires a lot of computing power, and by a lot,

I strongly suspect it's not just about the number of parameters, but more about adding feedback loops.

 

Lots of current models are basically one-directional, even LLMs that have some inherent bidirectional behavior. I feel fairly confident that an AGI will incorporate a data structure that is tree-like rather than vector-like, and that the circuitry will need feedback loops.


22 minutes ago, 05032-Mendicant-Bias said:

I strongly suspect it's not just about the number of parameters, but more about adding feedback loops.

Feedback loops mean that "stability" needs to be addressed: training might not converge anymore but oscillate between different solutions, and both training and inference become even slower and less parallel than they are now. I suspect this has already been tried, at least in papers.
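A toy numpy sketch of the stability worry (sizes and scales made up): the same "state fed back into itself" update may settle to a fixed point or keep wandering, depending on how strong the feedback weights are.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 8
W = rng.normal(size=(n, n)) / np.sqrt(n)   # random feedback (recurrent) weights
u = rng.normal(size=n)                     # a fixed external input

def run(scale, steps=200):
    x = np.zeros(n)
    for _ in range(steps):
        x = np.tanh(scale * W @ x + u)     # output fed back as the next input
    return x

for scale in (0.5, 5.0):
    x = run(scale)
    settled = np.allclose(x, np.tanh(scale * W @ x + u), atol=1e-6)
    print(f"feedback strength {scale}: settled = {settled}")
# Weak feedback typically converges to a fixed point; strong feedback can
# oscillate or drift indefinitely, which is exactly the training headache above.
```

Recurrent networks are the classic answer to this, and their vanishing/exploding-gradient problems are arguably the training-time face of the same stability issue.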


51 minutes ago, fulminemizzega said:

Feedback loops mean that "stability" needs to be addressed,

Indeed.

 

Assessing the stability of a control system is really hard. Control system theory has things like phase shift: if you get close to 180°, the sign changes and the feedback becomes positive. You experience that when a mic and a loudspeaker interact: the frequency that comes out is the one with a 180° phase shift at a loop gain above 1.

 

It's hard enough for LTI (linear time-invariant) systems, but it's already almost impossible to solve symbolically for nonlinear systems, and the problem scales incredibly badly with the number of inputs, outputs, and feedback variables. And neural networks have nonlinear stages between layers, and ginormous tensor matrices.
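One numerical (rather than symbolic) way to probe this, at least locally, is the standard linearization trick: find a fixed point, take the Jacobian of the nonlinear feedback stage there, and check its eigenvalues as if it were an LTI system. A sketch with made-up sizes:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 6
W = 0.8 * rng.normal(size=(n, n)) / np.sqrt(n)   # feedback weights (toy example)
u = rng.normal(size=n)

# Find a fixed point x* = tanh(W x* + u) by simple iteration.
x = np.zeros(n)
for _ in range(200):
    x = np.tanh(W @ x + u)

# Jacobian of the update at x*: diag(1 - tanh^2) @ W.
J = np.diag(1 - np.tanh(W @ x + u) ** 2) @ W
radius = max(abs(np.linalg.eigvals(J)))
print(f"spectral radius at the fixed point: {radius:.2f} (<1 means locally stable)")
```

It only tells you about behaviour near that one fixed point, which is precisely why the global, symbolic version of the question stays so hard.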

 

It's clear there is work to be done to make deep NNs with feedback.


Where to start...

 

First, LLMs don't "understand" language. For language to be "understood", one needs a metaphorical mental representation of the things spoken about. Simply re-encoding input letters into numbers and back again, like LLMs do, won't suffice.
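A toy character-level version of that "re-encoding letters into numbers" step (real LLMs use subword tokenizers and embeddings, and the interesting part is what happens to the numbers in between, but this is the encode/decode step being described):

```python
# Toy character-level "tokenizer": text in, numbers out, and back again.
# The model itself only ever sees the numbers, never the letters.
vocab = sorted(set("hello world"))
to_id = {ch: i for i, ch in enumerate(vocab)}
to_ch = {i: ch for ch, i in to_id.items()}

ids = [to_id[c] for c in "hello world"]
text = "".join(to_ch[i] for i in ids)
print(ids)   # [3, 2, 4, 4, 5, 0, 7, 5, 6, 4, 1]
print(text)  # "hello world"
```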

 

Second, ELIZA was not based on neural networks; it was a rule-based system. Neural networks would not be implemented for another 5-ish years after ELIZA was first released (although they already existed as theoretical constructs).

 

Thirdly, @LinusTech/@Emily Young step into the "sell ∀∃ as ∃∀" scam trap, but to explain that, I have to stretch out "a bit":

 

I think the best analogy for how neural networks work is a mixing console, like the ones audio engineers use: imagine a mixing console with two inputs, a synthesizer that can produce only a single sine wave, one single output, and billions of knobs that can take any value from 0 to 1. Those knobs modulate the sine wave; some have the effect of equalizers that raise or lower the amplitude of certain frequency bands, some are modulators that stretch or narrow certain timed areas of the sine wave. No one actually knows which knob does what exactly, but in their entirety, they re-modulate the sine wave such that the initial clean sine wave can potentially be transformed into virtually any audio signal (which, mathematically, is possible via Fourier transformations).

 

Now, we want to "train" this console. This means we put a song's waveform, say Nirvana's "Smells Like Teen Spirit", into input channel 1, and a human-readable representation of the song (for instance the name, title, and performer of the song in text form) into input channel 2. Now, we let the console choose randomly any setting for its billions of knobs and compare the output with the input. Chances are minimal that the output will match the input on the first try, so we instruct the machine to slightly alter the random knob settings and test again. We don't want to compare the output of each single trial with the input by hand, so we add a device that lets the console compare the input with the output itself until they match.

 

Once we have that match, we put in the next song and title, say Beethoven's 9th Symphony, into input channels 1 and 2, respectively, and repeat the training process, but with one important distinction: now, we instruct the machine to find a specific knob setting that is able to produce both songs with only the input of the second channel (i.e., the "prompt") as the distinguishing feature. One thing to understand is that our training input channel 1 is used only for control, to have something to check the generated output against. The only real external parametrization is channel 2, but this input has no effect on which knobs are turned or by how much, as this happens randomly. It might only alter the way the signal goes through the machine from the sine wave generator to the output channel, for instance: "If the first letter in channel 2 is an 'A', take route 1, else take route 2"; "if there is a 19th letter and it is a 'd', take the left at the 3,401,290th knob", and so on. But those aren't hard-programmed into the console either; they are also randomly "found" by the machine via trial and error, again by checking the two input signals against the output. If there is no (close enough) match, the machine slightly alters both the signal path and the knob settings, again randomly. What we do as console programmers is limit the amount of alteration the machine is allowed to make to its settings at each trial, for instance by telling it to alter each knob by at most +/-0.1, or to change only a maximum of 20,000 knobs per trial.
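A minimal sketch of that trial-and-error loop (pure random search with the bounded per-trial changes from the analogy; real training uses gradient descent rather than blind trials, but the "alter, compare against channel 1, keep if closer" cycle is the idea being described):

```python
import numpy as np

rng = np.random.default_rng(3)
n_knobs = 1_000
target = rng.random(n_knobs)      # stands in for "the song on input channel 1"

def output(knobs):
    # Trivial stand-in "console": here the output simply IS the knob settings.
    return knobs

def error(knobs):
    return np.mean((output(knobs) - target) ** 2)

knobs = rng.random(n_knobs)       # random initial setting of every knob (0..1)
for trial in range(20_000):
    tweak = np.zeros(n_knobs)
    idx = rng.choice(n_knobs, size=20)            # only a few knobs per trial...
    tweak[idx] = rng.uniform(-0.1, 0.1, size=20)  # ...each altered by at most +/-0.1
    candidate = np.clip(knobs + tweak, 0, 1)
    if error(candidate) < error(knobs):           # compare output with the reference input
        knobs = candidate                         # keep the better setting

print(f"final mismatch: {error(knobs):.4f}")
```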

 

The crucial point one needs to understand about neural networks is that the console itself is not the "model" we train; the model is the set of states of all the individual knobs and signal paths. "Training a model" means finding the best value for each of the knobs (and the best signal path) in order to perform a specific task, in this case, to reproduce songs with just the input from channel 2 as a parameter. Once the model is trained, those knob settings and signal paths are fixed and channel 1 is closed for good[1].

 

Of course, our "intelligent" mixing console should not only produce two songs, but thousands, or even millions of songs, of all kinds of genres and ages, so training our "music generator AI" means finding the one knob setting and signal path layout that is able to generate every song from only the channel 2 input. It should be obvious by now why training those models takes so immensely much time and energy.

 

Another point this analogy should show is that problems which are closely related are "easier" to handle in this machine than wide-ranging problem complexes: instead of training our mixing console to produce every song in history, we could build many smaller consoles that specialize in smaller problem fields, like specific genres, performers, or eras: one console for Seattle grunge, another for Viennese classical, another for sci-fi audio books, and so on. This would not only require way fewer parameters (i.e., knobs and paths), it would also shorten training times until the "right-ish" knob setting and signal layout is found.

 

Which leads us back to the "sell ∀∃ as ∃∀" scam: AI companies sell their products as "If our solve-it-all machine won't give you the correct output, you just asked the wrong question", a trope commonly known as "prompt engineering". They sell their products as "There exists (∃) a prompt that gives you, for all (∀) of your problems, the correct solution", when in fact NNs are designed to work by the principle: for every (∀) task, there exists (∃) a setting that gives the correct answer for that one specific task.
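Spelled out with the same quantifiers (notation mine), the swap is:

The pitch: ∃ prompt p ∀ task t : solves(p, t) ("one prompt style cracks every problem")
The design: ∀ task t ∃ knob setting θ(t) : solves(θ(t), t) ("for each problem, some setting exists")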

 

[1] Some modern LLMs leave both the "knob settings" and the first input channel "open" so that the model can be further trained while it is in "usage" mode, but the core design principle remains that the machine's purpose is to find the one parameter-setting layout that is able to solve one given problem.


On 6/14/2024 at 11:10 AM, 05032-Mendicant-Bias said:

You experience that when a mic and a loudspeaker interact: the frequency that comes out

This is a discrete-time case and there is no "sampling" involved, but anyway, besides the nonlinearity, this introduces a concept of "time" both during training and inference: the network would need to be evaluated multiple times for it to reach the correct output value. Anything that is connected to the output of neurons that have a feedback loop has to be propagated again; maybe only a portion of the NN has this requirement, but it looks quite expensive.

And memory has to be accounted for: there is state to be stored, and it grows both with the number of feedback loops and with the "order" of the system (a neuron's output may depend on its past value but also on its past-past value, and so on...).

The more I think about this, the more it just looks like another way to make a network deeper: say you have a neuron that has a feedback loop and depends on its past value, and you decide that 2 propagations are enough, so you evaluate the same input 2 times and then read the output. This could be done by using 2 neurons, where the 2nd shares the same inputs as the 1st but also receives the 1st's output. Maybe.
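A tiny sketch of that unrolling (made-up weights): evaluating one self-connected neuron twice gives exactly the result of two stacked copies where the second also sees the first's output.

```python
import numpy as np

w_in, w_fb, b = 0.7, 0.4, -0.1      # made-up input, feedback, and bias weights
x = 1.5                             # some input

# Option A: one neuron with a feedback loop, propagated twice.
h = 0.0
for _ in range(2):
    h = np.tanh(w_in * x + w_fb * h + b)

# Option B: the loop "unrolled" into two feedforward neurons,
# where the 2nd sees the same input plus the 1st's output.
h1 = np.tanh(w_in * x + w_fb * 0.0 + b)
h2 = np.tanh(w_in * x + w_fb * h1 + b)

print(np.isclose(h, h2))  # True: 2 propagations == 2 stacked neurons
```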


23 hours ago, fulminemizzega said:

This is a discrete-time case and there is no "sampling" involved, but anyway, besides the nonlinearity, this introduces a concept of "time" both during training and inference: the network would need to be evaluated multiple times for it to reach the correct output value

It is my opinion that embedding temporal relationships in the latent space is fundamental to achieving an AGI. If you look at Sora's cherry-picked videos, they have curious artifacts of time-reversed sequences in there. It knows what opening a door looks like, but it doesn't handle the direction time flows correctly.

Sampling and quantizing do have an effect on stability, but in weird ways, just like quantum particles behave weirdly because they only take discrete values, I guess. E.g., with digital amplifiers, there can be noise at pure frequencies instead of pink and white noise.

 

Having feedback does increase the effective depth of the network; I suspect that's the whole point: with the same layers, you might be able to artificially increase the effective number of layers. I also suspect that cross-connections between nets will be needed, like having layer 3 of network B wired to layer 2 of network A, and layer 5 of network A connected to layer 4 of network B. It's going to be a nightmare to figure out how to train such a net, but you only have to figure it out once.


I actually wrote a whole novel about AI last year and only published it recently. I've seen quite a few things I envisioned when writing Heart Of Circuits that are unfortunately becoming a reality... except there's a canine robot of an angry Welshman who has no shame stealing motorcycles or breaking into people's houses to steal stuff. I don't want to give any spoilers, but it's disturbing how my books are slowly predicting the future.

 

Is it weird that when I write a book, something similar to what I describe becomes popular during the later stages of writing or something happens a couple years after the fact? Am I being watched? Are tech companies spying on me while I write my books? What the f**k is going on?!


This is a great video; everything I would have wanted to say about it has already been said by others. Honestly, all I would want an "AI assistant" for on my PC is doing things like: "Hey, once my Steam downloads have finished, start my EA Desktop updates; once those have finished, run my Ubisoft updates, and after that shut the PC down", or: "Hey, tell me which of the .pngs in this .msstyles file are responsible for drawing the min/max/close buttons in Windows Explorer".

 

Apple seems to have a good plan for AI, but we'll see how it goes. Overall, it feels like companies want to fool people with magic tricks, and I would argue most companies pushing AI at the general consumer haven't earned anywhere near the trust needed to be allowed to handle (sensitive) user data.


I'm curious about the 'autonomy can't be done without Lidar' statement from the WAN Show. If lidar is 'mandatory' for an autonomous agent to be allowed to drive a car, how come a 2-eyed, forward-vision-only, frequently distracted, slow-reflexed driving agent is allowed on the road? Imho, the question about non-human agents being allowed to drive cars is not whether they can be 100% flawless. If they become less prone to (lethal) accidents than humans, insurance companies will jump on it and charge you more for the liberty of driving yourself. Next, governments will step in and restrict human driving.
 


2 hours ago, ddeconin said:

I'm curious about the 'autonomy can't be done without Lidar' statement from the WAN Show.
 

Because you have two eyes. People who only have one working eye, or poor depth perception (common among certain populations in the Lower Mainland), have a very hard time judging distance.

 

There is a way, however, to not need lidar: the car must have a second source of input, which is either the driver or a municipal-level automatic traffic control system. If an automatic traffic control system exists so that the car can be told when lights are green or red, or how many cars are in front of it, then things can work super efficiently, since vehicles can be properly spaced and their speeds coordinated.

 

However, relying only on cheap low-resolution cameras (those parking/reverse-gear cameras are often 480i off-the-shelf cameras connected by a composite video connector) is not enough.

 

I'd actually suggest that lidar itself not be used except for parking. Its purpose is to create a "3D map" around the vehicle, but that data isn't useful if the car is moving so fast that it can't understand the parallax movement. Polarized radar would probably be good enough (e.g., horizontally polarized in front, vertically polarized in the rear, rotated 45 degrees on the left/right, so that it doesn't pick up radar from other vehicles). Basically, just bounce the car's encoded "VIN" in the signal off objects, and if it hits another car, it can get an accurate distance. It could also then work out where that car is relative to itself and establish a car-to-car network link to share its exact speed and whether gas/cruise/brakes are pressed, so that sudden slamming on the brakes in front triggers all the cars behind it to do the same.

 


4 hours ago, Kisai said:

Because you have two eyes. People who only have one working eye, or poor depth perception (common among certain populations in the Lower Mainland), have a very hard time judging distance.

 

Again, most humans have 2 eyes, looking forward and not paying attention to the road most of the time. On the other hand, even a vision-only system like Tesla's has 5 or 6 cameras, looking in all directions at the same time and paying attention all the time. So I'm still not quite sure why lidar would be required for an autonomous agent to be a better driver than most humans.


2 hours ago, ddeconin said:

Again, most humans have 2 eyes, looking forward and not paying attention to the road most of the time.

LIDAR creates a depth map.

 

Stereo cameras "can't move" like human eyes do. I have a stereo webcam and it can NOT see behind anything even though it has depth. Your eyes can move and thus see the parallax. 

 

LIDAR is "Light Detection and Ranging", and is basically just a constantly running depth scanner. Cameras are easily blinded by light and do not work in darkness. Radar works in darkness. Lidar works in darkness.
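For reference, both approaches reduce to simple timing/geometry (idealized formulas, ignoring calibration and noise): lidar times its own light pulse, while a stereo camera pair has to infer depth from the disparity between two images.

```python
C = 299_792_458.0  # speed of light, m/s

def lidar_depth(round_trip_time_s: float) -> float:
    """Time-of-flight ranging: the pulse travels out and back."""
    return C * round_trip_time_s / 2

def stereo_depth(focal_px: float, baseline_m: float, disparity_px: float) -> float:
    """Idealized pinhole stereo: depth = focal_length * baseline / disparity."""
    return focal_px * baseline_m / disparity_px

print(lidar_depth(200e-9))          # ~30 m for a 200 ns round trip
print(stereo_depth(1000, 0.3, 10))  # 30 m for a 10 px disparity
```

Disparity shrinks with distance, so stereo depth gets noisy at long range, which is part of why direct ranging is attractive.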

 


17 hours ago, ddeconin said:

Again, most humans have 2 eyes, looking forward and not paying attention to the road most of the time.

It must be what Musk thought when he decided to make his incredibly difficult Autopilot problem even harder by excluding non-camera sensors.

 

Our gelatinous eyes are wired to GI wetware that runs in a few liters of volume on 20 W.

 

Cars need ALL the help they can get to make sense of the world around them, because state-of-the-art ANI cannot compare to a human brain. Lidar, time-of-flight sensors, radar, infrared cameras, ultrasound sensors, car-to-car communication, traffic information, etc. Each different type of sensor reduces the chance that a pedestrian is misclassified as a bird, or that an electric pole goes undetected because the cameras are blinded by a low sun.


14 hours ago, Kisai said:

LIDAR creates a depth map.

 

Ok, so the argument for lidar and 2 human eyes is that they can make a depth map and camera(s) can't? From what I've seen so far, Teslas seem to make pretty good depth maps with their cameras already.
On the 'blinded by light and don't work in darkness' point: human eyes get blinded by light and don't work very well in darkness either. Cameras can actually have much better light sensitivity than the human eye.
That said, I won't argue that lidar and/or radar could not improve an autonomous system's capabilities. I would, however, argue that none of the points made earlier make it impossible for a camera-only system to be a better (less prone to lethal accidents) driver than the average human.
Humans have actually proven incapable of consistently avoiding accidents in dense urban environments at speeds over 30 km/h.

