Instead of sensing photons and processing the results, why not process the photons?
I believe you meant charges rather than chargers. Emphasis mine.
That’s just the time needed for a camera to transform the photons hitting its aperture into electrical chargers using either CMOS or CCD sensors.
Faster is usually better, but at what point is it fast enough? Classifying in milliseconds rather than picoseconds is still way faster than human reflexes.
A photonic image/video/vision recognition system would be a huge step forward for autonomous moving systems. Your robot car could see and categorize a potentially dangerous object up ahead in nanoseconds. Having the evasive maneuver take a few more milliseconds is fine.
...
Faster is usually better, but at what point is it fast enough? Classifying in milliseconds rather than picoseconds is still way faster than human reflexes.
Well, sure. However, a system should be continuously imaging and evaluating. Faster is better, but as quantified in the article, we are talking a 20ms delay. The human brain is also not instantly determining what to do and our visual lag is approximately 20ms. Our actual reflex time is on the order of 200ms. And actual input to vehicle time is approximately 500ms when we are POISED to take action and are simply waiting.
I think the comparison to human reflexes isn't the right one to be making. If information can be processed in picoseconds, that gives an autonomous system more time to evaluate different options for avoiding a problem. Or evaluating multiple sensor inputs to determine if there's a problem in the first place.
Given the prevalence of problems like phantom braking or plowing into parked cop cars, there may be no "fast enough".
I think the objective is to build a system which responds similarly to the human "spinal reflex". The classic human example is if you touch something painful, your arm will jerk back before the full neural signal ever reaches the brain to trigger a decision/action. Instead, the reflex is triggered in the spinal column for an immediate self-preservation action, so that by the time the brain receives the signal, your hand is already "safe" and your brain can direct further conscious action.
Well, sure. However, a system should be continuously imaging and evaluating. Faster is better, but as quantified in the article, we are talking a 20ms delay. The human brain is also not instantly determining what to do and our visual lag is approximately 20ms. Our actual reflex time is on the order of 200ms. And actual input to vehicle time is approximately 500ms when we are POISED to take action and are simply waiting.
When we are less alert (as in almost always) our reflex time is closer to a second, especially if you are talking braking and you have to move your foot.
Faster IS better. Every 13.64ms is one foot you travel at 50mph (11.36ms at 60mph), so yes, 20ms can be the difference between missing something and a collision. That being said, vehicle systems can see a lot more and react drastically faster than humans already. What is MOST important is seeing the CORRECT thing as well as taking the CORRECT action. On that, autonomous vehicle systems still lag pretty far behind humans.
So at least TODAY, the most important thing is correctly processing signals and coming to the correct decision, not the speed at which it can process the input signal. And unless the photonic signal processors can do all of the processing on chip, it is still going to have to offload to something with significantly more processing power for image interpretation, decision making, and vehicle control.
Don't get me wrong, this is cool and there are certainly applications. But this is very far from replacing what currently exists and is perhaps chasing the wrong problem to solve. As to your last, that isn't a speed of signal processing issue, that is a data interpretation issue, which is what the struggle still is. Humans and human vision are still drastically better at that than what we have come up with for on-vehicle systems.
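The distance-per-latency arithmetic in this comment is easy to check. A minimal sketch, assuming only standard unit conversions; the function names are made up for illustration:

```python
FEET_PER_MILE = 5280
SECONDS_PER_HOUR = 3600


def feet_per_second(speed_mph: float) -> float:
    """Convert a speed in miles per hour to feet per second."""
    return speed_mph * FEET_PER_MILE / SECONDS_PER_HOUR


def ms_per_foot(speed_mph: float) -> float:
    """Milliseconds needed to cover one foot at the given speed."""
    return 1000.0 / feet_per_second(speed_mph)


def feet_travelled(speed_mph: float, latency_ms: float) -> float:
    """Distance covered, in feet, while waiting out a latency in milliseconds."""
    return feet_per_second(speed_mph) * latency_ms / 1000.0


if __name__ == "__main__":
    for mph in (50, 60):
        print(f"{mph} mph: {ms_per_foot(mph):.2f} ms per foot, "
              f"{feet_travelled(mph, 20):.2f} ft in 20 ms")
```

At highway speeds a 20 ms delay works out to somewhere between one and two feet of travel, which is the scale the comment is arguing about.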
Well, most 2nd order nonlinear optical crystals (like KTP) are incompatible with standard lithographic technologies. Assuming (I’ve not read the paper yet) it’s InP based, then you only have third order processes available. High confinement can make them efficient, but they are really hard to control.
I'm sure the authors are aware of the field of nonlinear optics. I would have loved a discussion about why something like frequency doubling of a 1064 nm laser beam with KDP crystals wasn't considered or failed its implementation.
Curious why you think the harmonic will do better than the fundamental line? For some applications, the shorter wavelength is an advantage. At this stage of the research, having more bandwidth certainly isn’t going to improve anything.
I'm sure the authors are aware of the field of nonlinear optics. I would have loved a discussion about why something like frequency doubling of a 1064 nm laser beam with KDP crystals wasn't considered or failed its implementation.
Doubling the frequency doesn't strike me as useful for an activation function -- frankly, despite the link text, doubling sounds like a linear operation. In software, the simplest activation function is to convert all the negative values to zeros -- you need something that causes a bend in the curve, not that simply replaces the curve with another curve of a different slope.
I'm sure the authors are aware of the field of nonlinear optics. I would have loved a discussion about why something like frequency doubling of a 1064 nm laser beam with KDP crystals wasn't considered or failed its implementation.
Frequency doubling is also a multiplication operation.
Curious why you think the harmonic will do better than the fundamental line? For some applications, the shorter wavelength is an advantage. At this stage of the research, having more bandwidth certainly isn’t going to improve anything.
Ah, excellent point on the lithography connection. That would be enough explanation.
Well, most 2nd order nonlinear optical crystals (like KTP) are incompatible with standard lithographic technologies. Assuming (I’ve not read the paper yet) it’s InP based, then you only have third order processes available. High confinement can make them efficient, but they are really hard to control.
Just read that it is based on Si-on-insulator technology, so third order is all you have (and, from memory, a fairly small chi3 compared to InP).
It's not AI. It's analog optical processing. The advance is the ability to do a lot of layers of optical processing in an integrated sensor array.
This is why AI is amazing
Curious why you think the harmonic will do better than the fundamental line? For some applications, the shorter wavelength is an advantage. At this stage of the research, having more bandwidth certainly isn’t going to improve anything.
To answer both of you, the fraction of the fundamental line that is converted to 532 nm is a nonlinear function of the intensity. The nonlinear function between layers in a neural net need not be a threshold function. It can be just about any nonlinear function. So one would do the linear algebra with the 1064 nm line, then use the doubling to 532 nm as the nonlinear function between layers. Given that you want to keep working with the 1064 nm line, however, you'd probably not want to use the green light for your next step. You'd want to stick with the IR. So you might use the conversion to green as a nonlinear loss function.
Doubling the frequency doesn't strike me as useful for an activation function -- frankly, despite the link text, doubling sounds like a linear operation. In software, the simplest activation function is to convert all the negative values to zeros -- you need something that causes a bend in the curve, not that simply replaces the curve with another curve of a different slope.
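A rough numerical sketch of the "conversion to green as a nonlinear loss" idea. It assumes an idealized, perfectly phase-matched plane-wave model in which the converted fraction follows the textbook tanh² form; `I_SAT` is a hypothetical scale intensity in arbitrary units, and nothing here is taken from the paper under discussion.

```python
import math

I_SAT = 1.0  # hypothetical scale intensity (arbitrary units), an assumption


def converted_fraction(intensity: float) -> float:
    """Fraction of the fundamental converted to the second harmonic.

    Idealized model: eta = tanh^2(sqrt(I / I_SAT)). At low intensity this is
    approximately I / I_SAT, so the green output grows as I squared.
    """
    return math.tanh(math.sqrt(intensity / I_SAT)) ** 2


def remaining_fundamental(intensity: float) -> float:
    """Infrared intensity left after the doubling stage (the 'lossy' output)."""
    return intensity * (1.0 - converted_fraction(intensity))


def relu(x: float) -> float:
    """Software reference point: the simplest common activation function."""
    return max(0.0, x)


if __name__ == "__main__":
    for i in (0.01, 0.1, 1.0, 2.0, 4.0):
        print(f"I_in={i:5.2f}  converted={converted_fraction(i):.3f}  "
              f"IR_out={remaining_fundamental(i):.3f}  relu={relu(i):.2f}")
```

In this toy model, doubling the input intensity does not double the infrared output, which is the "bend in the curve" an activation function needs. Whether such a stage could be built on an integrated chip is a separate question, as other comments point out.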
Is there an optical equivalent of a diode?
With the important distinction that there's a non-linear bit of processing included. Linear operations with light are quite simple: every interferometer is doing that.
It's not AI. It's analog optical processing. The advance is the ability to do a lot of layers of optical processing in an integrated sensor array.
There is at least one I know of, but it's like a backwards Zener diode. If you focus to a spot and ionize the air, the rest of the pulse can't get through the plasma ball. So it's a diode that cuts out at a given intensity.
Is there an optical equivalent of a diode?
The processing is done in silicon so it's just as robust as any other electronics in your car. Creating the photons needs some fiber optics and gratings, but that can be relatively robust.
I know nothing about their fabrication, but are these kinds of photonic processing units durable enough to be put in stuff like phones or cars? Working with light just seems like an inherently more fragile approach.
A Faraday isolator is also an optical diode, but that is even harder to integrate in a PIC than frequency doubling.
There is at least one I know of, but it's like a backwards Zener diode. If you focus to a spot and ionize the air, the rest of the pulse can't get through the plasma ball. So it's a diode that cuts out at a given intensity.
This statement is not true, and I’m curious what it’s supposed to say. As someone also noted above, a 4 GHz clock ticks in 250 ps. Not to imply a CPU is processing “a complete DNN” in that time, but a better comparison is needed.
The team that implemented a complete deep neural network on a photonic chip, achieving a latency of 410 picoseconds. To put that in perspective, Bandyopadhyay’s chip could process the entire neural net it had onboard around 58 times within a single tick of the 4 GHz clock on a standard CPU.
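The objection can be checked with a few lines of arithmetic. This sketch uses only the two figures quoted in the thread (410 ps latency, 4 GHz clock); the function names are illustrative.

```python
def clock_period_ps(freq_ghz: float) -> float:
    """Length of one clock tick in picoseconds for a clock given in GHz."""
    return 1000.0 / freq_ghz


def ticks_spanned(latency_ps: float, freq_ghz: float) -> float:
    """How many clock ticks elapse during the given latency."""
    return latency_ps / clock_period_ps(freq_ghz)


if __name__ == "__main__":
    period = clock_period_ps(4.0)       # 250.0 ps per tick at 4 GHz
    ticks = ticks_spanned(410.0, 4.0)   # a bit more than one and a half ticks
    print(f"4 GHz tick: {period} ps; a 410 ps inference spans {ticks:.2f} ticks")
```

So one pass through the network takes longer than a single 4 GHz tick, rather than fitting into one many times over.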
That’s not exactly fair. Current AI consists of repeated matrix operations followed by repeated nonlinear operations. This chip demonstrates exactly that, but at a much smaller scale than modern AI.
It's not AI. It's analog optical processing. The advance is the ability to do a lot of layers of optical processing in an integrated sensor array.
While the numbers might be true, the comparison isn’t meaningful. Your inverter uses x joules per bit. The equivalent optical inverter uses y joules per operation, but the number of bits encoded in the operation depends on how accurately we decide to measure the light amplitude. So a 1 bit inverter uses exactly the same power as a 100 bit inverter.
A 4GHz clock ticks every 250ps.
An inverter uses about 50 attoJ to switch. An 8 bit multiply and add needs around 20 femtoJ.
Optical computing is interesting, but the bar to beat is very high.
Right, but here you bypass the (imposed) camera refresh and the ECU processing rate restrictions, at least in certain scenarios.
FWIW, most commercial car-mountable lidar units run at 10 Hz. This is a hard limit because the lidar array has to physically spin around in a circle, so it’s only going to move so fast. Cameras are more flexible but a sample rate of 10-15 Hz is pretty standard just to ease compute pressure on the ECU, and I’m not aware of anyone even sniffing 30 Hz much less higher.
At 10 Hz, you have 100 ms to process each “frame” (LiDAR are weird) before you start to perceive the next one. The signal processing latency is peanuts by comparison.
Also let’s remember that most humans have reaction times in the 100-200ms range anyway, and our road safety margins are designed with this in mind. I find it very unlikely that cutting out the ISP and operating directly on the optical feed will result in latency improvements that anyone is willing to pay for.
It is cool though!
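For scale, here is a small sketch of the per-frame time budget at the sample rates mentioned above, compared with the 20 ms camera-path figure quoted earlier in the thread. The function names are illustrative.

```python
def frame_period_ms(rate_hz: float) -> float:
    """Time available per frame, in milliseconds, at a given sample rate."""
    return 1000.0 / rate_hz


def share_of_frame(latency_ms: float, rate_hz: float) -> float:
    """Fraction of one frame period consumed by a given latency."""
    return latency_ms / frame_period_ms(rate_hz)


if __name__ == "__main__":
    for hz in (10, 15, 30):
        print(f"{hz} Hz: {frame_period_ms(hz):.1f} ms per frame, "
              f"20 ms latency uses {share_of_frame(20.0, hz):.0%} of it")
```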
For active pulse imaging, e.g. LiDAR, sure, but for natural light the world is awash in photons arriving continuously.
The imaging example is a bad one. Most of the 20 ms is spent waiting for enough photons to arrive to make use of them. Trying to do that faster will yield lower detail like looking at a ray tracing scene lit with only a few rays.
However, there are plenty of control and/or optimization scenarios where having a shorter inference time would be very useful.
It’s been a number of years, the brain fades. I remember the use of an optical diode to prevent spatial hole burning in a laser we were developing. It was either a dye or a very early Ti:Sapphire. That was perhaps 40 years ago, or more. To put it into perspective, one of my team associates used to pass Heisenberg in the hallway at his previous position. I’m currently quite disconnected from current products and technologies. To be young again.
A Faraday isolator is also an optical diode, but that is even harder to integrate in a PIC than frequency doubling.
Tesla claims that its FSD chip processes camera images at 2,300 frames/sec, on cars equipped with the HW3 hardware, first introduced in 2019 (!). It is not clear how many cameras are involved.
FWIW, most commercial car-mountable lidar units run at 10 Hz. This is a hard limit because the lidar array has to physically spin around in a circle, so it’s only going to move so fast. Cameras are more flexible but a sample rate of 10-15 Hz is pretty standard just to ease compute pressure on the ECU, and I’m not aware of anyone even sniffing 30 Hz much less higher.