Artificial Intelligence & Robotics Tech News For October 2022


Robot Dog Learns Soccer Skills

Researchers from the Hybrid Robotics Lab at the University of California, Berkeley have trained a robot dog to be a soccer goalkeeper with a better shot-blocking rate than Premier League human players. The robot dog can squat, sidestep, jump, and dive to block the soccer ball and then return to its starting position. With a blocking rate of 87.5 percent, it greatly exceeds the average of professional human soccer goalies, which is just 69 percent. The robot dog was trained via reinforcement learning, an area of machine learning in which an artificial intelligence learns by trial and error from the feedback its actions receive. The researchers have detailed the process in a preliminary paper titled "Creating a Dynamic Quadrupedal Robotic Goalkeeper with Reinforcement Learning."

Teaching a quadruped robot to be a soccer goalie involves many hard problems, including combining highly dynamic locomotion with quick, accurate, non-prehensile object manipulation to control the soccer ball. The Mini Cheetah, the nickname given to the robot by the Biomimetic Robotics Laboratory at MIT, had to block the ball with dynamic locomotion movements in under a second. Weighing only 20 pounds, the small, agile robot dog can run, squat, do backflips, and even strafe sideways like a crab; if it is knocked down, it can swiftly return to its original position with a quick, kung-fu-like movement of its elbows. According to MIT, the Mini Cheetah is the first robot dog that can do a backflip, and it can trot across uneven terrain at nearly twice the speed of a normal person's walk.

The goal the robot was tasked with defending measured roughly five feet by three feet and was situated approximately 14 feet from the ball. The Mini Cheetah does not carry its own camera, so the ball's location is determined with an external camera and YOLO, a popular, lightweight computer vision algorithm that uses a convolutional neural network to perform real-time object detection. The dog was subjected to a variety of kicks and throws from humans, as well as shots from a second quadruped robot built by Unitree Robotics. While the researchers concentrated exclusively on goalkeeping, they say the framework could extend to other scenarios, such as multi-skill ball kicking. Making robots that could compete with human soccer players is one of their long-term goals; there is already a separate yearly robot-versus-robot soccer competition, the RoboCup, which has been running since 1997.
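
To make the perception step concrete, here is a minimal sketch of how an external camera plus an off-the-shelf YOLO model can localize a ball in a frame. This uses the public YOLOv5 release (COCO classes include "sports ball"); the Berkeley team's exact detection pipeline may differ, so treat this purely as an illustration.

```python
import torch
import cv2

# Load a small pretrained YOLOv5 model; COCO class id 32 is "sports ball".
model = torch.hub.load('ultralytics/yolov5', 'yolov5s', pretrained=True)

cap = cv2.VideoCapture(0)          # external camera watching the goal area
ret, frame = cap.read()
results = model(frame[..., ::-1])  # OpenCV gives BGR; the model expects RGB

# Keep only "sports ball" detections and report the ball's pixel position.
for *xyxy, conf, cls in results.xyxy[0].tolist():
    if int(cls) == 32:
        x1, y1, x2, y2 = xyxy
        cx, cy = (x1 + x2) / 2, (y1 + y2) / 2
        print(f"ball center at pixel ({cx:.0f}, {cy:.0f}), confidence {conf:.2f}")
```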

Breakthrough Humanoid Robotics Technology

Clone Robotics has gone to extraordinary lengths to ensure that humanoid robots will have some of the most realistic hands on the market, with hydrostatic muscles that move under transparent skin and bear an uncanny resemblance to a real human appendage. Clone Robotics claims to have created the first biomimetic hand, model number V15, which is capable of grasping all kinds of objects regardless of shape, from tennis balls to suitcases to weights and more. The robotic hand's thumb and fingers are designed to be extremely realistic in both appearance and operation, and the team built the internal muscles around the concept of a McKibben muscle. These muscles are essentially a mesh of tubes with balloons inside: when a balloon's radius increases, usually driven by a pneumatic or hydraulic pump external to the muscle, the mesh is forced to shrink longitudinally.

Clone Robotics prefers not to use large external pumps; instead, their aim is to design muscles that can be stimulated electrically to contract with a certain degree of control. With this design goal in mind, the team devised a way of filling the balloons with acetaldehyde and using electric currents to activate the muscles. When current is applied, the acetaldehyde in the balloons quickly begins to boil, raising the internal pressure: the liquid boils at just 68 degrees Fahrenheit, and its vapor pressure is roughly 6.6 times higher at 158 degrees Fahrenheit.

Clone Robotics then constructed the skeleton from a set of human-like bones with hinged joints, giving the robotic hand mobility matching that of a real human hand. The result is an arm with approximately 27 degrees of freedom, similar to a human hand with integrated hand and forearm rotation. Each movement of the robotic arm is controlled by a complicated network of tendons and muscles extending along the forearm and hand. In the prototype currently in development, Clone Robotics uses a basic hydraulic setup to activate the muscles, with pressure distributed by a 500-watt pump running at 145 psi through a set of 36 electro-hydraulic valves, each with its own pressure gauge. Built-in magnetic sensors relay information back to the onboard artificial intelligence, which then adjusts the joints' angles and velocities. The company plans to start selling its robotic hands by the end of this year, though the exact cost has not yet been revealed. Clone Robotics' next product will be a full robotic torso with a spine and 124 muscles located in the hands, neck, shoulders, chest, and upper back, designed for integration into the company's locomotion platform, which will carry the unit's battery pack.
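
The boiling-liquid actuation can be sanity-checked with textbook thermodynamics. The sketch below estimates the vapor-pressure growth of acetaldehyde between those two temperatures using the Clausius-Clapeyron relation; the enthalpy of vaporization is an approximate literature value used only for illustration, not a figure from Clone Robotics.

```python
import math

# Rough Clausius-Clapeyron estimate of how acetaldehyde's vapor pressure
# grows between 68 F (20 C) and 158 F (70 C).
R = 8.314            # J/(mol*K), gas constant
dH_vap = 25.8e3      # J/mol, approximate enthalpy of vaporization of acetaldehyde
T1 = 293.15          # 68 F in kelvin
T2 = 343.15          # 158 F in kelvin

ratio = math.exp(-dH_vap / R * (1.0 / T2 - 1.0 / T1))
# Prints roughly 4.7x: the same order of magnitude as the ~6.6x quoted above
# (the simple relation ignores the temperature dependence of dH_vap).
print(f"vapor pressure grows roughly {ratio:.1f}x")
```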

New Google AI Turns Text Into High-Resolution Videos

Google Brain has presented Imagen Video, a text-conditional video generation tool built on a cascade of video diffusion models. Recent advances in AI have enabled incredible progress in generative modeling, particularly with text-to-image models, and Imagen Video advances text-to-video AI systems, another step forward in the evolution of generative modeling capabilities. Video generative models can have a positive impact on society by increasing and enhancing human creativity, but they can also be misused to create harmful, hateful, explicit, or inappropriate content. These concerns have been mitigated by multiple measures: in internal trials, for example, input text prompt filtering and output video content filtering were employed to prevent the creation of such content. Although internal testing suggests that most explicit and violent content can be filtered out, social biases and stereotypes remain difficult to filter, meaning many ethical and safety issues still lie ahead.

Google's new text-to-video model works like this: it first takes an input text prompt and encodes it into textual embeddings with a T5 text encoder, then runs them through cascaded diffusion models. The base video diffusion model generates a 16-frame video at 24-by-48-pixel resolution and 3 frames per second, which is followed by multiple temporal super-resolution and spatial super-resolution models. These models upsample the output into a final 128-frame video at 1280-by-768-pixel resolution and 24 frames per second, resulting in a 5.3-second high-definition clip.

Imagen Video creates high-definition videos using a base video generator and a series of interleaved spatial and temporal super-resolution video AI models. The system is a scaled-up, high-definition text-to-video model whose design decisions include fully convolutional spatial and temporal super-resolution models at specific resolutions and the v-parameterization of diffusion models. Google Brain transferred previous findings from diffusion-based image generation into the video setting, and for fast, high-quality sampling they use progressive distillation on their video models with classifier-free guidance. Imagen Video is capable of producing videos with high fidelity, and it has a high degree of controllability and world knowledge: it can generate diverse videos and animations in different artistic styles and can even understand 3D objects. Imagen Video also employs the video U-Net architecture for spatial fidelity and temporal dynamics; temporal self-attention is used in the base video diffusion model, while temporal convolutions are used in the temporal and spatial super-resolution models, allowing Imagen Video to model long-term temporal dynamics. Because Imagen Video and its frozen T5-XXL text encoder were trained on data containing problematic content, the Google Brain team will not release the Imagen Video source code or the model until the aforementioned ethical and safety concerns have been addressed.
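
The numbers in that cascade are easy to verify. The sketch below simply computes how much the super-resolution stages grow the base clip; the stage list is a simplification (the paper's full cascade reportedly chains seven sub-models), so treat it as arithmetic on the figures quoted above, not as the model's architecture.

```python
# Endpoints of the Imagen Video cascade as described above.
base = {"frames": 16, "height": 24, "width": 48, "fps": 3}
final = {"frames": 128, "height": 768, "width": 1280, "fps": 24}

frame_growth = final["frames"] / base["frames"]                      # 8x more frames
pixel_growth = (final["height"] * final["width"]) / (base["height"] * base["width"])
duration = final["frames"] / final["fps"]                            # clip length in seconds

print(f"{frame_growth:.0f}x temporal upsampling, {pixel_growth:.0f}x more pixels per frame")
print(f"final clip: {duration:.1f} s at {final['fps']} fps")         # ~5.3 s
```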

New DALL-E-Powered Robotics

This is the first attempt to develop web-scale diffusion artificial intelligence models for robotics. DALL-E-Bot allows a robot arm to rearrange objects within a scene by inferring a description of the objects, generating an image that represents a natural, human-like arrangement, and finally physically placing the objects in accordance with the target image. This is significant because the system achieves it zero-shot, using DALL-E without any additional data collection or training. It is a promising direction for web-scale robot learning algorithms, because results from human studies show that it is possible to align future developments of these models with robotics applications. The researchers have also proposed a list of recommendations to the text-to-image community.

One of the most significant recent advances in machine learning has been web-scale image diffusion models such as DALL-E 2 from OpenAI. By training on hundreds of millions of image-caption pairs from the web, these models learn a language-conditioned distribution over natural images from which novel images can be generated given a text prompt. Large language models, also trained on web-scale data, were recently applied to robotics to let language-conditioned policies generalize to novel language commands. Given these successes, the researchers wanted to see whether web-scale text-to-image diffusion models such as DALL-E can be exploited for real-world robotics. Since these models can generate realistic images of scenes, they must also understand how to arrange objects in a natural way: an image generated from a kitchen prompt, for example, is more likely to show plates neatly placed on a table or a shelf. This has a clear application in robotics for predicting goal states in object rearrangement tasks, a canonical challenge in the field. Manually aligning goal states with human values is brittle and cumbersome, and web-scale learning offers a solution by implicitly modeling natural distributions of objects in a scalable, unsupervised manner.

The research proposes DALL-E-Bot as the first method to explore web-scale image diffusion models for robotics. The framework enables DALL-E to predict a goal state for object rearrangement given an image of an initially disorganized scene: the pipeline converts the initial image into a text caption, passes the caption to DALL-E to generate a new image, and then obtains goal poses for each object from that image. Notably, the publicly available DALL-E was used as-is, without any further data collection or training. This is important because it allows for zero-shot autonomous rearrangement, going beyond prior work, which often requires collecting examples of desirable arrangements and training a model specifically for those scenes.
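
A high-level sketch of that caption-generate-match pipeline is below. Every helper here (segment_objects, caption_scene, generate_image, match_poses, pick_and_place) is a hypothetical placeholder standing in for the paper's components, not a real API.

```python
# Hedged sketch of a DALL-E-Bot-style rearrangement loop.
def rearrange(initial_rgb):
    objects = segment_objects(initial_rgb)        # find objects in the messy scene
    caption = caption_scene(objects)              # describe them as a text prompt
    goal_img = generate_image(caption)            # ask a text-to-image model (e.g. DALL-E)
    goal_poses = match_poses(objects, goal_img)   # align each real object to the generated image
    for obj, pose in goal_poses.items():
        pick_and_place(obj, pose)                 # move the object to its goal pose
```

The key design point the paper emphasizes is that the generative model is used zero-shot: only the perception and pose-matching stages touch robot data.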

Elon Musk Reveals the Tesla Optimus AI Humanoid Robot

Showing off Tesla's humanoid AI robot for the first time, Elon Musk stated that the robot will make a significant contribution to progress in artificial intelligence, autonomous driving, and artificial general intelligence, also known as AGI. Musk said the Tesla Optimus robot's current iteration can move all of its fingers independently, with thumbs that move with two degrees of freedom, allowing it to carry out a myriad of real-world tasks. The robot is being designed for high-volume production at a price point of less than twenty thousand dollars. Tesla's humanoid robot features 28 structural actuators and 11 degrees of freedom in its hands, supports Wi-Fi and LTE, and runs at 52 volts. The team intends to reduce the robot's cost and increase its efficiency over time to enable production at scale by reducing part count and power consumption; to achieve this, they will minimize the wire count in the arms and locate the compute and power-distribution hardware at the physical center of the robot. In the middle of the torso sits the battery pack, which holds 2.3 kilowatt-hours, enough for roughly eight hours of work on a single charge; a single printed circuit board within the pack takes care of sensing, charge management, and power distribution.

Tesla plans to use its supply chain and its Autopilot software and hardware for the Optimus robot. This will allow the robot to process vision data, make split-second decisions using multi-sensory inputs, and carry out communications, with audio support and hardware-level security to protect both the robot and humans in its vicinity. The robot employs a form of Tesla's Autopilot, using its neural networks for computer vision to achieve volumetric depth rendering of objects and to estimate its local environment, for instance to navigate to its charging station. The robot's locomotion planning is carried out in three stages: first calculating its desired path, then planning its footsteps along that path, and finally computing a reference trajectory using state estimation and sensor data. The robot tracks its center-of-mass trajectory so it can walk steadily despite dynamic changes in terrain or balance.

For manipulating objects in the real world naturally, the team first generated a library of natural motion references using motion capture and then adapted those motions to current real-world situations, such as picking up an object. To generalize motions to real-world variations that depend on object location, they ran the reference trajectories through a trajectory optimization program to solve for where the hands should move and how the robot should balance while adapting to a given situation. The Tesla robot's hands are ergonomically designed to grasp objects the way a human hand would, so they can function easily and efficiently inside a factory environment. The hands have sensory feedback and adaptability for new objects, with six actuators, 11 degrees of freedom, and non-backdrivable fingers with the power to carry a 20-pound bag.

The robot's actuators come in six unique designs across 28 separate joints, including three rotary and three linear actuator designs with multiple sensors, all with extremely impressive output force; in fact, the actuators are powerful enough to lift a half-ton grand piano, a requirement since human muscles are capable of similar force. The robot's structural foundation is extremely strong, built to weather falls without breaking actuators or requiring expensive maintenance, so it can continue working as normal. The team ran the model through extensive simulations using Tesla's crash software to stress all the components and design them to be as resilient as possible. They even modeled the robot's joints on human joints, so they can generate maximum torque from a bent position while minimizing force to conserve energy.

Overall, Tesla is off to an impressive start with Optimus, and the company says it will begin optimizing the robot for real-world tests in its factories, automating a growing number of operating processes over the coming months and years. Tesla claims the Optimus robot features a level of artificial intelligence not currently present at other leading robotics companies such as Boston Dynamics. Furthermore, Elon Musk noted that there is a level of democracy involved in the robot's overall direction, as Tesla shareholders can vote with their shares to alter the trajectory of the robot's design or implementation. The Tesla robot is expected to eventually provide economic output as much as two orders of magnitude greater than a human's, but a great deal of work lies ahead, as the model shown is only the first iteration. Tesla's ultimate goal is to make so many Optimus robots that the global economy becomes quasi-infinite, leading to a future of abundance and abolishing poverty; because these humanoid robots can perform work tasks in place of humans, they could quickly begin to spare people from dangerous or back-breaking labor.
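
The three-stage locomotion pipeline described above can be summarized in a short skeleton. All function names and signatures below are illustrative placeholders, not Tesla APIs; the sketch only shows how the stages feed into one another.

```python
# Hedged skeleton of path -> footsteps -> reference trajectory planning.
def plan_walk(start, goal, terrain, state_estimator):
    path = compute_desired_path(start, goal, terrain)    # stage 1: where to go
    footsteps = plan_footsteps(path, terrain)            # stage 2: where to step
    trajectory = reference_trajectory(                   # stage 3: how to move
        footsteps, state_estimator.read())
    return trajectory

def walk(controller, trajectory, state_estimator):
    # Track the center-of-mass trajectory so the robot stays balanced even
    # when the terrain shifts or it gets pushed.
    for setpoint in trajectory:
        com = state_estimator.center_of_mass()
        controller.track(setpoint, com)
```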

Machine-Learning-Driven Exoskeleton

Researchers from the Stanford Biomechatronics Laboratory have created machine-learning-powered exoskeleton boots that allow wearers to move faster and more easily while using less energy. The robotic boots have a motor that works with the calf muscles to give the wearer a boost with every step. Unlike other exoskeletons, this push is personalized, thanks to a machine-learning model trained through years of work with emulators on a treadmill. The device provides twice the energy savings of previous exoskeletons, which in the real world translates to significant energy savings and walking-speed improvements. The main goal is to help people with mobility impairments, especially older people, to move around the world in the way they prefer. With this breakthrough, the researchers believe the technology can be commercialized and miniaturized within the next few years. The first time a user puts on the exoskeleton it can take a bit of adjustment, but after about 15 minutes of walking it becomes quite natural: the exoskeleton's movement is like having an extra bounce in your stride and provides a noticeable increase in comfort.

The biggest obstacle to successful exoskeletons in the past was the need for individualization. Most exoskeletons were designed using a combination of intuition and biomimicry, but people's sizes and movements are too complicated and diverse for that approach. The research team used a large laboratory exoskeleton emulator, which quickly determined the best ways to help individual users and uncovered the blueprints for effective portable devices to be used in the field. When students and volunteers were connected to the emulators, the researchers recorded their energy expenditure and motion data to determine how walking with the exoskeleton affected their energy output. The resulting data showed the advantages of the various types of assistance provided through the emulator and was fed into a machine-learning model to individually optimize the exoskeleton for each user.

The exoskeleton speeds up walking by applying torque at the ankle, replacing some of the function of the calf muscles. When users take a step, right as they rise onto their toes to push off, the device assists them in pushing their foot off the ground. The machine-learning component then tunes the exoskeleton to provide a slightly different assistance pattern depending on the user's size and speed; as it measures the movement, the model determines the best way to help the user with each step. The researchers found that the exoskeleton allowed users to walk nine percent faster while using 17 percent less energy per mile traveled.
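
The core loop here is human-in-the-loop optimization: try assistance settings, score them by the wearer's measured energy use, and keep what helps. The toy below shows that idea with a random-search loop over two hypothetical parameters (peak torque and push-off timing); the Stanford system learns from emulator data rather than this simple scheme, so this is a sketch of the concept only.

```python
import random

def energy_cost(peak_torque, push_timing):
    """Placeholder for a real energy measurement of one walking bout
    (here a synthetic bowl-shaped cost with measurement noise)."""
    return (peak_torque - 0.6) ** 2 + (push_timing - 0.54) ** 2 + random.gauss(0, 0.01)

best = {"peak_torque": 0.3, "push_timing": 0.5}     # normalized starting settings
best_cost = energy_cost(**best)
for _ in range(50):
    # Perturb the current best settings and test them on the "wearer".
    trial = {k: min(1.0, max(0.0, v + random.gauss(0, 0.05))) for k, v in best.items()}
    cost = energy_cost(**trial)
    if cost < best_cost:                            # keep settings that lower energy use
        best, best_cost = trial, cost

print("personalized assistance parameters:", best)
```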

New Google AI Generates 3D Models From Text

DreamFusion is an evolution of DreamFields, the generative 3D AI system Google revealed in late 2021. For DreamFields, Google combined the image-analysis model CLIP from OpenAI with Neural Radiance Fields, also known as NeRF, letting a neural network store 3D models. DreamFields uses NeRF's ability to generate 3D views and combines it with CLIP's ability to assess the content of images: upon receiving the input text, an untrained NeRF model generates random views from a single viewpoint, which are then evaluated by CLIP. The feedback serves as a correction signal to the NeRF model, and the process repeats until there is a 3D model that matches the description.

DreamFusion refines this process. It uses Google's pretrained 2D image diffusion model, Imagen, to perform text-to-3D synthesis: for DreamFusion, Google replaces CLIP with a new loss based on its own Imagen model, which Google AI says could enable many new applications of pretrained diffusion models. The 3D generation doesn't require 3D data for training, which wouldn't be available at the required scale anyway; instead, DreamFusion learns to represent 3D objects using 2D images from Imagen generated from different perspectives. This is done with view-dependent prompts such as "front view" and "rear view," and the entire process is automated. DreamFusion produces reliable 3D objects that are more detailed, with better quality and depth, than DreamFields, and it also allows users to combine multiple 3D models into one scene and to generate surface normals from a text input. The Google AI research team writes that their approach requires no 3D training data and no modifications to the image diffusion model, which demonstrates the effectiveness of pretrained image diffusion models as priors.

Users can export the NeRF models they generate into meshes with the marching cubes algorithm, allowing them to be easily integrated into popular 3D rendering or modeling software. Google's DreamFusion could prove to be an extremely useful tool for users in the metaverse: it would let non-technical users who want to create structures or objects to populate their simulated worlds simply type a description of what they envision and create it in a flash.
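
The "diffusion model as a loss" idea can be sketched in a few lines. Below is a hedged outline of one score-distillation step: render the NeRF from a random viewpoint, add noise, ask a frozen text-conditioned diffusion model what noise it sees, and push the NeRF so its renders better match the model's expectation. `sample_random_camera`, `render`, `add_noise`, `weight`, and `diffusion_eps` are illustrative placeholders, not a real API.

```python
import torch

def sds_step(nerf, diffusion_eps, text_emb, optimizer):
    camera = sample_random_camera()                  # pose matching e.g. a "front view" prompt
    image = render(nerf, camera)                     # differentiable render of the NeRF
    t = torch.randint(20, 980, (1,))                 # random diffusion timestep
    noise = torch.randn_like(image)
    noisy = add_noise(image, noise, t)
    with torch.no_grad():
        eps_hat = diffusion_eps(noisy, t, text_emb)  # frozen Imagen-style denoiser
    # Score distillation: the gradient w.r.t. the image is w(t) * (eps_hat - noise),
    # back-propagated through the render into the NeRF weights.
    loss = (weight(t) * (eps_hat - noise) * image).sum()
    optimizer.zero_grad(); loss.backward(); optimizer.step()
```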

Breakthrough Tesla AI Supercomputer

Elon Musk's Tesla says that just four of its Tesla-made Dojo supercomputer cabinets will provide the equivalent computing power of what currently requires 4,000 Nvidia GPUs spread across more than 72 racks. Each cabinet holds two assemblies, with each assembly carrying six tiles of 354 densely packed cores per tile; an assembly is capable of 54 petaflops of compute and has 640 gigabytes of high-bandwidth memory. Each Dojo ExaPod accelerator will have a total of 1.1 exaflops of machine-learning compute, 1.3 terabytes of SRAM, and 13 terabytes of DRAM. Dojo promises to dramatically speed up the rate at which models can be trained and improved, and Tesla has made similar performance promises for other kinds of work that involve creating AI and machine-learning models for autonomous vehicles.

Tesla will deploy Dojo in clusters referred to as ExaPods, each consisting of 10 cabinets and able to achieve 1.1 exaflops of machine-learning computing power. The combined total of the data center will be 8.8 exaflops of processing power, primarily devoted to training artificial intelligence algorithms for Tesla's self-driving vehicles and its Optimus humanoid robot. For reference, 8.8 exaflops is about eight times faster than the world's fastest supercomputer, Frontier.

Dojo operates differently from GPU- and CPU-based AI supercomputers: it is composed of tiles, which take an entirely different approach than regular GPUs and CPUs. Modern GPUs each carry many thousands of cores: the newly released Nvidia GeForce RTX 4090 comes with over 16,000 cores, while the GPUs in Tesla's previous Nvidia-based supercomputer each have over 6,900 cores, for a total of over 40 million GPU cores across the entire machine. GPUs are popular for training artificial intelligence and machine learning applications like those used in developing Tesla's self-driving systems. CPUs currently contain up to 64 cores, with each node offering at most two CPUs and 128 cores; a CPU-based AI supercomputer combines many of these nodes into one system, like the previously mentioned Frontier supercomputer, which uses 9,400 nodes with over 600,000 cores. Dojo is unique because, instead of combining many smaller chips like traditional approaches to supercomputing, Tesla's D1 tile acts as one huge chip with 354 cores specially designed for artificial intelligence and machine learning tasks. Six of these tiles are placed in a tray together with supporting compute hardware, and two trays fit in a single cabinet, giving each cabinet a total of 4,248 cores; a 10-cabinet ExaPod therefore has a total of 42,480 cores. Because the Dojo supercomputer is specially designed for AI and machine learning tasks, it is several times more efficient than GPU- and CPU-based supercomputers with the same data center footprint, and Tesla plans to deploy its first Dojo ExaPod in 2023 at its Palo Alto data center.
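
The core counts above follow directly from the tile/tray/cabinet hierarchy, as this quick check shows:

```python
# Checking the Dojo core arithmetic quoted above.
cores_per_tile = 354
tiles_per_tray = 6
trays_per_cabinet = 2
cabinets_per_exapod = 10

cores_per_cabinet = cores_per_tile * tiles_per_tray * trays_per_cabinet
print(cores_per_cabinet)                        # 4248 cores per cabinet
print(cores_per_cabinet * cabinets_per_exapod)  # 42480 cores per 10-cabinet ExaPod
```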

Record-Breaking Humanoid Underwater Drone With Haptic Feedback Lets Users Feel With Their Hands

OceanOne has swum through sunken ships, planes, and a submarine, and has plunged to a depth of nearly one kilometer, with unique features that let its users feel as though they are physically interacting with the underwater world by providing sensory feedback to their hands. The underwater humanoid robot, designed by Stanford experts together with Meka Robotics, physically shields people from potentially dangerous and inaccessible underwater environments while linking their abilities, knowledge, experience, and intuition to the task at hand. OceanOne's top half is a humanoid robot with two robotic arms and two cameras in its face that let operators see in 3D, while its lower half has eight multidirectional thrusters that enable precise underwater maneuvering. The robot's haptic feedback system relies on touch and stereo vision to create incredibly real sensations for the operator, making them feel as though they are at the bottom of the ocean; through OceanOne's robotic eyes, the human operators can see the environment in high definition. OceanOne serves two goals: to explore areas no one has explored before, and to demonstrate that human touch, vision, and interaction can reach locations that were previously far removed from us.

While OceanOne had numerous memorable adventures and accomplishments during two long-distance trips across the Mediterranean, the crew's most notable achievement was demonstrating operation at a depth of more than one thousand meters. It marked the first time an underwater robot provided haptic-feedback-based interaction at that depth. The robot's journey to the one-kilometer mark was an extended effort that began with countless hours of design, experimentation, and assembly in the lab, dozens of debugging trips to the Stanford pool, and a myriad of lessons learned before facing the unpredictability of the deep sea. The first version of the robot was built to reach depths of only 200 meters, so to let it go deeper, the scientists adapted the body to incorporate a special foam made from glass microspheres; the microspheres provide buoyancy and are strong enough to withstand the enormous pressure at one-kilometer depths, which is 100 times greater than at sea level. The robotic arms contain an oil-filled spring mechanism that compresses the oil to match the outside pressure, preventing collapse or crushing of the electronic components. The researchers also revised several tiny components across OceanOne to limit the amount of compressed air within individual parts while keeping the robot as small as possible, and added improvements that increased the flexibility of its head and arm motions as well as two different types of hands.

The OceanOne project is not just a showcase for the latest innovations in haptic feedback, underwater drone robotics, and human-robot interaction; it also opens new possibilities for underwater engineering and marine science, including inspecting and repairing boats and submerged infrastructure such as bridges, piers, and pipelines. Other expeditions are planned for locations across the globe, including lost cities in deep waters, coral reefs, and archaeologically valuable wrecks at depths beyond the reach of human divers. Humanoid underwater robots like this can go deep underwater to find and recover materials, build infrastructure, and conduct emergency recovery or disaster-prevention operations, and in the future they could be adapted to operate deep inside mines, on mountaintops, or even in space.
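
The "100 times sea-level pressure" figure is straightforward hydrostatics, as the check below shows:

```python
# P = atmospheric pressure + rho * g * h for seawater at 1 km depth.
rho_seawater = 1025      # kg/m^3, typical seawater density
g = 9.81                 # m/s^2
depth = 1000             # m
atm = 101_325            # Pa, 1 standard atmosphere

pressure = atm + rho_seawater * g * depth
print(f"{pressure / atm:.0f}x atmospheric pressure")  # ~100x
```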

Breakthrough Google AI Text-Based Image Editing Model

High-fidelity image manipulation from text input is a long-standing problem in computer graphics research. Using a text command to describe an edit, like "men wearing suits" or "pixel art," is significantly easier than carrying out the changes manually in image-editing software. Text-to-image models like DALL-E, Imagen, and Stable Diffusion are proficient at creating images from scratch, but for image-editing tasks these models usually require the user to specify masks, and they often struggle with edits that depend on the masked portion of the image. Google AI has developed UniTune, a method for editing images from a textual description of the desired result while preserving high fidelity to the entire input image, including the edited portions. Fidelity is preserved both for visual details such as shapes, colors, and textures and for semantic details including actions, poses, and objects. UniTune can edit arbitrary images in complex, cross-domain scenes and can perform both localized and global edits; it is unique in its ability to make image-wide stylistic changes without sacrificing semantic details, and it can carry out complex local edits in their logical locations.

UniTune uses large-scale text-to-image diffusion AI models to execute expressive image edits, and Google AI researchers found that, with the right parameters, fine-tuning a large diffusion model on a single image-prompt pair doesn't result in catastrophic forgetting. Furthermore, the semantic and visual knowledge the model learned during training remains usable across a very wide variety of edit operations simply by using classifier-free guidance. The balance between fidelity and expressiveness can be tuned by controlling the number of training steps and the learning rate, or the amount of classifier-free guidance and SDEdit. Fine-tuning diffusion models is a powerful technique relevant to many use cases, such as image-to-image translation and subject-driven image generation: by minimizing overfitting at training time, the technique can learn the essence of a subject without learning transient, image-specific attributes such as pose, camera angle, or background. The Google AI researchers say UniTune is the first method to use fine-tuning of a large diffusion model for image-editing tasks.
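
Classifier-free guidance, the knob UniTune reportedly turns to trade edit strength against fidelity, has a simple standard form. In the sketch below, `model` is a placeholder denoiser; the interpolation formula itself is the usual one from the diffusion literature.

```python
def guided_noise(model, x_t, t, text_emb, guidance_scale):
    """Standard classifier-free guidance: blend conditional and
    unconditional noise predictions from the same denoiser."""
    eps_uncond = model(x_t, t, cond=None)      # unconditional prediction
    eps_cond = model(x_t, t, cond=text_emb)    # text-conditioned prediction
    # A larger guidance_scale pushes the sample harder toward the text prompt
    # (more expressive edit), at some cost to fidelity.
    return eps_uncond + guidance_scale * (eps_cond - eps_uncond)
```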

New Deep Learning Technology Breakthrough Uses Light Waves

MIT researchers have developed a new approach that uses optics to speed up machine learning computations on smart speakers and other low-power devices. The technique shifts the memory-intensive steps of running a machine learning model into components encoded onto light waves. The light waves are sent through a fiber-optic network to a receiver that uses an ordinary optical device to quickly execute calculations using the components of the model carried by the light. The method can deliver more than a 100-fold improvement in energy efficiency over other approaches while also strengthening security, since the user's personal data doesn't travel to a centralized server. It could allow autonomous vehicles to make decisions in real time using only a tiny fraction of the energy today's power-hungry systems require, enable uninterrupted, low-latency conversation with a smart home device, support live video processing over wireless networks, and even allow high-speed image classification on a spacecraft thousands of miles from Earth.

The neural network design the researchers created is named NetCast, and it stores the model's weights on a server connected to an innovative device called a smart transceiver. The smart transceiver is a small chip that transmits and receives data, using silicon photonics to fetch trillions of weights from memory every second, reading the weights from electrical signals and imprinting them onto light waves. Because the weight information is encoded as bits of ones and zeros, the transceiver converts them by switching lasers on for a one and off for a zero. It then periodically transmits groups of light waves through the fiber-optic network, reducing latency and removing the need for the client to request weights from the server. Once the light waves arrive at the client, a broadband modulator uses them to execute super-fast analog calculations, encoding the input information the client device receives, such as sensor data, against the incoming machine learning weights. Finally, it routes each wavelength to a light detector that measures the outcome of the calculations.

The researchers devised a way to use this modulator to perform trillions of multiplications per second, greatly increasing on-device processing speed while using an extremely small amount of energy. They evaluated the design with weights transmitted across an 86-kilometer fiber link connecting their lab with MIT Lincoln Laboratory. NetCast allowed them to perform computer vision tasks with very high precision, achieving 98.7 percent accuracy on image classification and 98.8 percent on digit recognition, at rates of up to 96 kilobytes per second. As a next step, the researchers are looking to improve the smart transceiver chip to maximize performance, and they intend to shrink the receiver, currently the size of a shoebox, to about one-third of that size so it can be incorporated into an electronic device as small as a cell phone.
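
Numerically, the scheme boils down to dot products between server-streamed weights and local inputs. The toy below is a purely numeric stand-in, with a plain array standing in for the on/off laser pulses and a matrix product standing in for the modulator-plus-detector; it illustrates the data flow, not the optics.

```python
import numpy as np

rng = np.random.default_rng(0)
weights = rng.choice([0.0, 1.0], size=(10, 64))   # bits imprinted on light by the transceiver
sensor_input = rng.random(64)                     # local client data, never sent to the server

# Modulator multiplies, photodetector integrates: one output = one dot product.
activations = weights @ sensor_input
print(activations.shape)   # (10,) outputs computed on the receive side
```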

New General Robotic Arm Manipulation

Prompt-based learning has emerged as an efficient approach in natural language processing, where a single general-purpose language model can be instructed to execute any task specified through an input prompt. In robotics, however, task specification takes many forms: one-shot demonstrations for imitation, instructions expressed in language, visual goals to achieve. These are usually treated as different jobs requiring specialized machine learning models. AI researchers from Nvidia, Stanford, Caltech, Tsinghua, and UT Austin have now shown that a full range of robot tasks can be expressed with multimodal prompts that interleave visual and textual tokens. The researchers designed an agent based on Transformer neural networks, a generalist robot agent named VIMA, which processes prompt signals and outputs motor actions autoregressively. To train and evaluate the model, they developed a new simulation benchmark with thousands of procedurally generated tabletop tasks with multimodal prompts, over 600,000 expert trajectories for imitation learning, and a four-level evaluation protocol for systematic generalization. VIMA scales up with model capacity and data size, and it outperforms previous methods in the most difficult zero-shot generalization settings by as much as 2.9 times with identical training data. Even with 10 times less training data, VIMA is still 2.7 times better than the next best alternative.

Quantum Computer Breakthrough Tunes Qubits for a Programmable Solid-State Superconducting Processor

Scientists have shown for the first time that large numbers of quantum bits, or qubits, can be tuned to interact with one another while maintaining coherence for an extraordinarily long time in a programmable, solid-state superconducting processor. The breakthrough was achieved by researchers at Arizona State University and Zhejiang University in China, together with two researchers in the UK. In a recently published paper, the researchers present the first demonstration of quantum many-body scarring states, abbreviated QMBS, as a powerful method of maintaining coherence between interacting qubits. These exotic quantum states offer the possibility of vast multipartite entanglement for a range of quantum computing tasks, enabling high processing speed with low power consumption. The paper, titled "Many-body Hilbert space scarring on a superconducting processor," was published in the journal Nature Physics. One of the researchers commented that QMBS states possess an intrinsic and generic capability for multipartite entanglement, making them extremely appealing for applications such as quantum sensing and metrology. The main focus of the study is understanding how to delay thermalization in order to preserve coherence, which has long been regarded as a crucial research goal for quantum computing.

In Vitro Neurons Learn and Display the Ability to Communicate When Embodied in a Simulated Game World

Integrating digital systems with neurons could enable performance that is not possible with silicon alone. Researchers have presented DishBrain, a system that harnesses the inherent computational capabilities of neurons within a structured environment. In vitro neural networks of rodent or human origin are integrated with in silico computing through a high-density multi-electrode array, and via electrophysiological stimulation and recording, the culture is embedded in a simulated game world that mimics an arcade version of Pong. Applying implications from the theory of active inference via the free energy principle, the researchers observed apparent learning within five minutes of real-time gameplay, something not evident under control conditions. Further experiments showed the importance of closed-loop, structured feedback in triggering the learning process. The ability of cultures to self-organize their activity in a purpose-driven manner when confronted with sparse information about the consequences of their actions is what the researchers call synthetic biological intelligence. Future applications could offer further insight into the cellular correlates of intelligence, and harnessing the computing power of living neurons to produce synthetic biological intelligence, once confined to science fiction, could now be within reach. The advantages of biological computation have been extensively researched with the aim of creating biomimetic devices capable of neuromorphic computing, and instantiating synthetic biological intelligence could bring about a paradigm shift toward silico-biological computational platforms that exceed the performance of classical silicon hardware. Theoretically, synthetic biological intelligence could even emerge before artificial general intelligence, or AGI, due to the inherent efficiency and evolutionary advantages of biological systems.
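
DishBrain's closed loop can be summarized in a short sketch: game state is written to the culture as patterned stimulation, spiking activity is read back as the "action," and predictable versus unpredictable feedback closes the loop, in line with the free-energy-principle framing above. All functions here are hypothetical placeholders, not the authors' software.

```python
def play_pong(electrode_array, game):
    while game.running:
        stim = encode_ball_position(game.ball)     # sensory stimulation pattern
        electrode_array.stimulate(stim)
        spikes = electrode_array.record()          # electrophysiological readout
        game.move_paddle(decode_action(spikes))    # spikes drive the paddle
        if game.paddle_hit():
            electrode_array.stimulate(predictable_feedback())   # structured "reward"
        elif game.missed():
            electrode_array.stimulate(random_feedback())        # unpredictable input
```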

Breakthrough DeepMind AI Discovers New Matrix Algorithms

With the publication of their paper in Nature, the Google DeepMind team has introduced AlphaTensor, the first artificial intelligence system to discover novel, efficient, and provably correct algorithms for fundamental tasks like matrix multiplication. It addresses a 50-year-old question in mathematics: how to multiply two matrices as fast as possible. The paper represents a crucial step in DeepMind's mission to advance science by unlocking the most fundamental problems with artificial intelligence. AlphaTensor builds on AlphaZero, the line of game-playing agents that displayed remarkable performance in board games like Go and chess and first made headlines back in January 2016, and the new paper shows the evolution of that work from playing games to tackling unsolved mathematical problems for the first time.

DeepMind explored how modern AI techniques can automatically discover new matrix multiplication algorithms. Matrix multiplication is the math used to process images on smartphones, recognize speech commands, generate graphics for computer games, and run simulations to predict weather, and it also helps compress data and videos so they can be shared more easily on the internet. AlphaTensor discovered algorithms that are more efficient than the current state of the art for many matrix sizes; the AI-designed algorithms outperform human-designed ones, which represents a significant leap of progress in algorithmic discovery.

The DeepMind research team first converted the problem of finding efficient matrix multiplication algorithms into a single-player game. The board is a three-dimensional tensor, an array of numbers that captures how far the current algorithm is from being correct. The player must use a set of allowed moves, corresponding to algorithm instructions, to modify the tensor and zero out its entries. If the player succeeds, the resulting matrix multiplication algorithm is provably correct for any pair of matrices, and its efficiency is measured by how many steps it took to zero out the tensor. The game is extremely challenging because the number of possible algorithms to consider is greater than the number of atoms in the universe, even for small cases of matrix multiplication. Compared with Go, which remained a challenge for AI for decades, the number of possible moves at each step of this game is some 30 orders of magnitude larger, above 10 to the 33rd power for one of the settings considered. To play this game well, one must find the tiny needles in a vast haystack of options. To surmount the challenges of this domain, which differ significantly from those of traditional games, DeepMind developed multiple crucial components: a novel neural network architecture that incorporates problem-specific inductive biases, a procedure to generate useful synthetic data, and a recipe to leverage the symmetries of the problem.

The DeepMind team started the AI without any knowledge of existing matrix multiplication algorithms and used reinforcement learning to train the AlphaTensor agent to play the game. AlphaTensor learns and improves over time, eventually rediscovering historical fast matrix multiplication algorithms such as Strassen's, then surpassing human intuition and uncovering faster algorithms than were previously known. After enough training, AlphaTensor uncovered a wide range of algorithms with state-of-the-art complexity, thousands of matrix multiplication algorithms for every size, showing that the space of matrix multiplication algorithms is richer than was previously believed.

These algorithms have many mathematical and practical properties, and DeepMind took advantage of them by modifying AlphaTensor to find algorithms that run fast on specific hardware, such as an Nvidia V100 GPU or a Google Tensor Processing Unit. Those algorithms multiply large matrices 10 to 20 percent faster than the commonly used algorithms on the same hardware, demonstrating AlphaTensor's flexibility in optimizing arbitrary objectives. Google DeepMind's results can also guide future research in complexity theory, which aims to identify the most efficient algorithms for solving computational problems. By exploring the space of possible algorithms more efficiently than previous approaches, AlphaTensor helps illuminate the richness of matrix multiplication algorithms and could yield new insights toward determining the asymptotic complexity of matrix multiplication, one of the most fundamental open problems in computer science. Because matrix multiplication is a core component of many computational tasks, spanning computer graphics, digital communications, neural network training, and scientific computing, the algorithms AlphaTensor discovered could make computations in these fields significantly faster and more efficient. Its ability to optimize for any objective could also lead to new applications in designing algorithms that target metrics such as energy use and numerical stability, helping prevent small rounding errors from snowballing over the course of an algorithm's work.
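
Strassen's scheme, one of the historical fast algorithms AlphaTensor rediscovered, multiplies two 2x2 matrices with 7 multiplications instead of the naive 8. It is short enough to show in full:

```python
def strassen_2x2(A, B):
    """Multiply two 2x2 matrices with 7 scalar multiplications (Strassen, 1969)."""
    (a, b), (c, d) = A
    (e, f), (g, h) = B
    m1 = (a + d) * (e + h)
    m2 = (c + d) * e
    m3 = a * (f - h)
    m4 = d * (g - e)
    m5 = (a + b) * h
    m6 = (c - a) * (e + f)
    m7 = (b - d) * (g + h)
    return [[m1 + m4 - m5 + m7, m3 + m5],
            [m2 + m4, m1 - m2 + m3 + m6]]

print(strassen_2x2([[1, 2], [3, 4]], [[5, 6], [7, 8]]))  # [[19, 22], [43, 50]]
```

Applied recursively to blocks, saving one multiplication per 2x2 level is what drops the asymptotic cost below the naive cubic bound, which is why shaving even a single multiplication for some matrix size matters.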

Breakthrough Make-A-Video Text-to-Video AI System From Meta

Meta, formerly known as Facebook, has unveiled a new artificial intelligence system called Make-A-Video, which lets people transform text prompts into short, high-quality video clips. Make-A-Video continues Meta's recent advances in generative AI research and has the potential to create new opportunities for creators and artists. The new model builds on recent progress in text-to-image generation technology to enable text-to-video generation: the AI system learns what the world looks like from images with descriptions, and how the world moves from unlabeled video. Make-A-Video thereby allows users to bring their imagination to life, creating unique, whimsical videos from just a text prompt, full of vibrant colors, characters, landscapes, and other details. It can even create videos from existing images, or produce similar videos from existing videos that are given to it.

Meta says it is committed to developing responsible AI and ensuring safe use of this state-of-the-art video generation technology, so to reduce harmful, misleading, and biased content, its research adheres to the following steps. First, the neural network analyzes millions of data points to gain insight into the world, and Meta applies filters to minimize the possibility of generating harmful content. Second, because Make-A-Video can create videos that look real, Meta adds a watermark to all of them, intended to let viewers know the video was created with AI and is not a recorded video. Third, while Meta hopes to make the technology widely accessible to the public soon, until then it will continue to thoroughly analyze and test the model to ensure each step of the release is safe. Finally, Make-A-Video uses publicly accessible datasets, giving the research an additional level of transparency. Meta says it is sharing its findings openly in a research paper and plans to release a demonstration experience as part of an ongoing commitment to open science.

Generative AI research encourages creativity by giving people tools to create new content quickly and easily. Make-A-Video is a follow-up to Meta's earlier announcement of Make-A-Scene, a multimodal generative AI method that gives users more control over the AI-generated content they create; with Make-A-Scene, Meta demonstrated how users can create photorealistic illustrations and storybook-quality art using words, lines, or even freeform sketches. Meta stated that it is important to think carefully about how new generative AI systems are created and that it will continue to use its responsible AI framework to refine and evolve the emerging technology.

High-Resolution Wearable Electro-Tactile Rendering Device Simulates the Sense of Touch in the Metaverse

A team of researchers led by City University of Hong Kong has created a wearable tactile rendering system that provides simulated touch at a distance, with high spatial resolution and rapid response. The team showed its potential with a Braille display and by adding the feeling of touch in the metaverse for activities like gaming and shopping in virtual reality, and the system could also help deep-sea divers, astronauts, and other workers who must wear extremely thick gloves. Although there has been great progress in developing sensors that digitally capture tactile features with high resolution and sensitivity, we still lack a system that can effectively virtualize the sense of touch, recording and playing back tactile sensations over space and time. Working in collaboration with Tencent's Robotics X Laboratory in China, the team created a unique electro-tactile rendering technique that provides diverse tactile sensations; the results were published in the scientific journal Science Advances.

Methods for recreating tactile sensations fall into two categories: mechanical stimulation and electrical stimulation. Mechanical devices apply a localized force or vibration to the surface of the skin; while they can produce continuous, stable tactile sensations, they can be heavy, which limits their resolution when used in a wearable device. Electro-tactile stimulators instead trigger sensations of touch within the skin using electrodes that pass an electric current through it. They are lightweight and flexible while offering better resolution and quicker response; however, most depend on high-voltage direct-current pulses, sometimes up to several hundred volts, to penetrate the topmost layer of the skin and stimulate nerves and receptors, which has previously been dangerous and has lacked high resolution. The most recent electro-tactile actuator designed by the team, in contrast, is thin, flexible, and safe. It fits easily into a finger cot, and the device provides a host of different tactile sensations, including pressure, vibration, and roughness, all at very high resolution. Instead of DC pulses, the team devised a high-frequency alternating stimulation technique that reduces the operating voltage to 30 volts, ensuring the tactile display is both comfortable and safe. They also proposed a new super-resolution method that produces tactile sensations in the areas between electrodes, improving the spatial resolution of their stimulators by more than 300 percent and giving the user a remarkably lifelike sense of touch. The new system can elicit tactile stimuli with a spatial resolution of 76 dots per square centimeter, similar to the density of the relevant receptors in human skin, and a rapid response rate of 4 kilohertz.

The team performed a variety of tests to illustrate uses for the wearable electro-tactile rendering technology. For instance, they proposed an innovative Braille method that is simpler for people with visual impairment: it splits the numerals and the alphabet into distinct strokes and sequences, exactly the way the letters are written. Wearing the electro-tactile rendering device on the fingertip, the wearer can recognize the displayed letter by sensing the direction and sequence of the strokes. This would be particularly useful for people who lose their eyesight later in life, allowing them to continue reading and writing with the same alphabetic system without having to learn the full Braille dot system. A second use case is virtual and augmented reality games and applications, bringing the sensation of touch to the virtual world. The electrodes are flexible and stretchable, so they can cover larger areas such as the palm of the hand. The researchers showed that users can feel the texture of clothing in a virtual shop, feel an itchy sensation on their fingers when licked by a VR cat, and, while stroking the VR cat's fur, feel the roughness change as their strokes shift in direction and speed.

The system is also useful for transmitting fine tactile information through thick gloves. The team integrated the light, thin electrodes into a safety glove: a sensor array detects the distribution of pressure on the outside of the glove and relays the data to the user in real time by triggering their sense of touch. In a test, a participant was able to swiftly and precisely locate tiny steel washers, just one millimeter in radius and 0.44 millimeters thick, using the tactile feedback the glove provided through its sensors and stimulation. The system's capabilities are evident: it can provide high-quality tactile feedback that is currently inaccessible to firefighters, astronauts, and deep-sea divers who must wear heavy gloves and protective suits. The technology could benefit a broad spectrum of applications, including information transmission, surgical training, teleoperation, and multimedia entertainment.
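
The "tactile super-resolution" idea can be illustrated with a toy model: driving two neighboring electrodes with complementary amplitudes can place a perceived stimulus between them. The linear weighting below is an assumption for illustration only, not the paper's exact method.

```python
import numpy as np

electrode_positions = np.array([0.0, 2.0])    # two electrodes, 2 mm apart on the skin

def virtual_point(target_mm, max_current=1.0):
    """Split drive current between two electrodes so the perceived
    stimulus lands at target_mm between them (illustrative weighting)."""
    w = (target_mm - electrode_positions[0]) / np.ptp(electrode_positions)
    return {"electrode_A": max_current * (1 - w),   # closer electrode drives harder
            "electrode_B": max_current * w}

print(virtual_point(0.5))   # stimulus perceived 0.5 mm from electrode A
```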

Artificial Intelligence Turns a 100,000-Equation Quantum Physics Problem Into Just Four Equations Without Losing Accuracy

The work, published in the September issue of Physical Review Letters, stands to revolutionize the way scientists study quantum systems of many interacting electrons. The approach can also be adapted to other problems, such as helping to design materials with desirable properties like superconductivity or utility for clean power generation. The researchers, from the Flatiron Institute's Center for Computational Quantum Physics, started from a huge object made of all the coupled-together differential equations of the problem and used a machine learning model to compress it to a fraction of a fraction of its original size.

The problem concerns how electrons behave as they move on a grid-like lattice, with interactions occurring whenever two electrons occupy the same lattice site. This configuration, known as the Hubbard model, lets scientists model several important materials while also studying how electron behavior gives rise to desirable phases of matter, such as superconductivity, in which electrons flow freely through a material. The model also serves as a testing ground for new methods before they are applied to more complex quantum systems. Because interacting electrons become quantum mechanically entangled, the Hubbard model requires cutting-edge computational methods and serious computing power even for small numbers of electrons, and each additional electron makes the computational challenge exponentially harder.

One way to study such a quantum system is with what is called a renormalization group, a mathematical apparatus physicists use to study how the behavior of a system like the Hubbard model changes when properties such as temperature are altered, or when the system is examined at different scales. A renormalization group that keeps track of all possible couplings between electrons without sacrificing any can contain tens of thousands, hundreds of thousands, or even millions of equations that must be solved; the equations are so complicated because each one represents an interaction between two electrons. The researchers wondered whether a machine learning tool called a neural network could help manage the renormalization group. The neural network works like a cross between a frantic switchboard operator and survival-of-the-fittest evolution: the program first creates connections within the full-sized renormalization group, then tweaks those connections until it finds a small set of equations that yields the same solution as the original, jumbo-sized renormalization group. Even with only four equations, the program captured the quantum physics of the Hubbard model.

All in all, the machine learning method discovered hidden patterns and amazed the researchers with results beyond their original expectations. The program took a lot of computational power to train, running for several weeks, but it can now be adapted to solve other problems. The researchers are currently investigating what the machine learning model actually learns about the system, which could lead to insights that would otherwise be difficult for physicists to uncover. The biggest remaining question is whether the new approach will work on more complex quantum systems, such as materials in which electrons interact over long distances. The researchers say there are also exciting opportunities to use this machine learning technique in other areas involving renormalization groups, including neuroscience and cosmology.
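
The exponential difficulty mentioned above is easy to see: in the Hubbard model, each lattice site can be empty, spin-up, spin-down, or doubly occupied, so the state space grows as 4 to the power of the number of sites.

```python
# Why brute force fails for the Hubbard model: the Hilbert space is 4^N.
for sites in (4, 8, 16, 24):
    print(f"{sites:2d} sites -> {4**sites:.2e} basis states")
```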

Soft Robotics Learn to Grip Using the Right Amount of Force

MIT researchers from the Computer Science and Artificial Intelligence Laboratory, together with the Toyota Research Institute, have created a robotic system that lets users grasp tools and apply the right amount of force to accomplish a task, such as squeegeeing liquid or writing a word with a pen. The robotic system, called ser
