Unveiling SurrealDB 1.0.0 – SurrealDB World Keynote with Tobie Morgan Hitchcock

[Music] SurrealDB: A Step Ahead [Applause] Welcome to SurrealDB World 2023! I'm delighted that you could join us today, as we release SurrealDB 1.0 and dive into several major areas of database development. It's been just one year since Jamie and I launched our idea of what a database could look like to the world, and the interest, enthusiasm and uptake have been incredible. All around the world we have developers, teams and organisations building and creating applications on top of SurrealDB in ways that we hadn't even imagined, from embedded IoT devices and offline research data stores for healthtech, to gaming or traditional database deployments within larger tech platforms. Today we have people who have joined us from Europe, the USA, Asia, Africa and of course here in London, with over 2,000 people joining us online. I think I can speak for the team to say that we are really excited about what we have been working on this last year, and are releasing today.

We know that SurrealDB will change how  developers and organisations build and   simplify their applications, taking  their projects to the next level.   Today has been the culmination of groundbreaking  work both within the SurrealDB team and with   contributions from the wider community,  and I can't wait to show you the result.   I'm excited to talk to you about SurrealDB  version one today, but before we dive into   the product I'd like to welcome onto the stage  Developer Relations Marketing Manager, Aravind,   to update you on the community and growth that we  have seen over the last year [Applause] [Music] Thank You Tobie. Wow, it's so great  to be here with you all today.  

Over the last year, the community, growth and interest in SurrealDB have been infectious. During this time SurrealDB has been downloaded over 250,000 times and has reached an astonishing 22,250 stars on GitHub, as of this morning. Across our repositories we now have over 150 contributors and 500 active members, involved with bug fixes, feature requests, documentation improvements and community discussions. On Discord, our members now total over 4,500, with hundreds more joining every week. Members from the online community have made incredible contributions not only to the core database, but to client libraries in a number of different languages, from Rust and JavaScript to Python, Dart and Erlang. Many of our team members have joined us from the community itself,

to work on building SurrealDB into the future.  Internally, so that we can support and help   our global community evolve, our Community  Team at SurrealDB is now five people strong.   Over the next few months we will be building  on, and improving, our documentation,   deployment guides and tutorials, to enable  developers to get going with SurrealDB,   utilising all of its powerful functionality even  faster. The community to us is who we are, from   ideas and use cases all the way to contributions;  without you SurrealDB would not be what it is.   Join us at the 'Driven by the Community' talk  by my colleagues Naiyarah and Alex later here   on the same stage, where we will dive more  into the community that has shaped SurrealDB. [Applause] [Music] Thank you Aravind for that update. As Aravind  mentioned, the community is so important to us  

here at SurrealDB. Now, I'm sure that you will all have seen the SurrealDB clothing worn by all of our team members, in our online videos, at our SurrealDB socials, or here today at SurrealDB World. Today I'm pleased to announce that the SurrealDB store is officially open. Let's take a sneak peek. The launch of the SurrealDB store was about strengthening and extending the brand, producing clothing that can be worn by both developers and the wider community too. All our clothing is exclusively produced by Stanley Stella, who specialise in sustainable and ethically produced garments and who are dedicated to using organic and eco-friendly materials. Visit SurrealDB.Store to check it out.

At SurrealDB we have been working hard  internally to ensure that our clothing   line reflects the SurrealDB brand  and the things which we care about.   Originality, quality, sustainability,  design: these are some of the core   attributes which we always want to keep in  mind as we build SurrealDB into the future. Over the last year an important focus of ours has  been on the tools and interfaces which developers   use to interact and develop with SurrealDB. These  come in the form of our query language and our   client SDKs. We want SurrealDB to fit seamlessly  within developers' workflows and tech stacks.   In addition to improvements to the  SDKs for JavaScript and Golang,   we now also have community created  SDKs for Java, C#, and .NET.   But some of our most interesting  work has been on our Rust SDK,   which will form the basis of other client  libraries in the future. Client SDKs,  

built on top of the Rust engine, benefit from  the SurrealQL type system, local query parsing   and a binary communication protocol, which  leads to better code and improved performance.   On top of this users will be able to run SurrealDB  natively within the programming language of their   choice, using all the features that they currently  experience with the SurrealDB database server.   This month we'll be releasing SurrealDB.wasm,  SurrealDB.node, SurrealDB.deno and SurrealDB.py.   All of these client libraries will be built  on top of our native Rust SDK, and will   enable developers to run SurrealDB right within  JavaScript in the browser, or on the server side,   and within Python. Looking further  forward we'll be releasing an SDK for C,  

on top of which even more native SDKs can be built.

Now, when we launched SurrealQL alongside SurrealDB in August last year, to say that there were certain 'opinions' about a new query language would have been an understatement. But with comments like "the 'S' in SQL now stands for Surreal" spurring us on, we have added an incredible amount of functionality. SurrealQL, with a host of new statement types, is growing from an SQL-like query language into its very own programming language. Developers can now use even more advanced expressions and logic to model and query their data. FOR statements enable simplified iteration over data and support advanced logic when dealing with nested arrays or recursive functions. The THROW statement can be used to return custom error types, which allow for building advanced programming and business logic right within the database and authentication engine. Code blocks and multi-line sub-queries can be used alongside the looping and error functionality, and allow for nested blocks of code with a single return type.
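To give a flavour of how these statement types fit together, here is a minimal SurrealQL sketch; the table, field and function names are purely illustrative rather than taken from a real schema.

```
-- FOR loops over the results of a sub-query, running a block of statements per record
FOR $person IN (SELECT VALUE id FROM person WHERE age >= 18) {
    UPDATE $person SET can_vote = true;
};

-- THROW surfaces a custom error from within query, event or authentication logic
THROW "registration_closed";

-- DEFINE FUNCTION creates a reusable custom function (stored procedure) in the database
DEFINE FUNCTION fn::greet($name: string) {
    RETURN "Hello, " + $name + "!";
};
RETURN fn::greet("SurrealDB World");
```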

These blocks can build on top of any of the advanced SurrealQL capabilities, making use of graph queries, record linking and data aggregation functions. In fact, over the last year we have added support for global parameters, custom function definitions or stored procedures, SurrealQL constants, range queries, complex record IDs and a new strict typing system, all now available in SurrealQL 1.0. All of this new and improved functionality enables developers to build advanced logic right within the database, or to query their data remotely in simplified ways, saving development time and letting developers focus on the product. Alongside these improvements to SurrealQL, we have also made it possible for database administrators to configure their database in a more secure way. Capabilities introduced in SurrealDB 1.0 enable fine-grained control of the specific functions and network destinations

that can be used when running a SurrealDB  server, or when operating in embedded mode. Now as a layered database platform, SurrealDB  operates with the storage separated from the   compute layer. As a result of this, SurrealDB  supports the ability to run as an embedded   database server in a number of programming  languages, as a single node server or   as a distributed database cluster. We now  enable running SurrealDB on top of RocksDB,   SpeedDB, FoundationDB and TiKV,  and on IndexedDB in the browser.  

While all of these key-value storage engines have their own benefits and will be supported by SurrealDB, a really exciting area of focus this year has been on SurrealKV. SurrealKV is our native embedded storage engine built entirely in Rust. Unlike other B-tree-based or LSM-tree-based data structures, SurrealKV builds upon TART, our custom-built timed adaptive radix trie, which forms the basis of concurrent and versioned data storage in the SurrealKV storage engine. SurrealKV will be optimised for multi-writer workloads, with the ability to query historically at any version. As a transaction-based, ACID-compliant data store layer, SurrealKV will form the basis of Version Control within SurrealDB, a foundational feature which will enable us to support data change auditing, graph versioning, historic network analysis and aggregate queries over time. This embedded SurrealKV engine itself is optimised for large data sets and versioned values, splitting the storage of keys (which will often reside in memory) from the values, which are more likely to reside on disk. SurrealKV will enable us to deploy SurrealDB natively in

any programming language, without the need for  complex bindings with C libraries or packages.   Now we're not releasing SurrealKV with SurrealDB  Version 1 today, but we are really pleased with   the progress we have made and look forward to our  native storage engine being available in a future   release soon. In the meantime, however, you can  follow along the development progress online,   and today we are releasing SurrealKV as an  open-source project with an Apache 2.0 license. I'd now like to invite onto the stage  Software Engineer and Epidemiologist,   Dr Caroline Morton, to talk about how SurrealDB is   being introduced for research purposes  within a clinical setting in the NHS.   [Applause] [Music] Thanks Tobie. I'd like to introduce the concept  of how SurrealDB can be used to create dummy data.   Okay, so say you have a cough and you visit your  GP. You get your blood pressure taken, you get  

your lungs listened to, and you get a diagnosis of pneumonia; maybe you get some antibiotics. This is the sort of thing which will get recorded. So, every time you visit your GP or a hospital, your appointment gets stored as a series of time-stamped codes; SNOMED codes for primary care, ICD-10 codes for hospital, and this gets used for research. The underlying codes get picked from what is basically a big tree. So for example, cough: I might put cough down and that's a finding, but the parent code of cough might be 'respiratory function finding', and cough itself is a parent to about 43 different codes; chesty cough, allergic cough, all sorts of different types of cough. The basic thing is, it's a graph. So then researchers like myself, we use statistical code to carry out research. Now,

lots of researchers don't think of themselves as programmers, but they write code. Unlike in lots of startups, though, no one ever really checks their code, and the output of it is a paper saying 'this is what I did' and 'this is what I found'. Now, I think we could agree that the ideal situation would be that you share the paper, what you found, and you also share the code, so we can see exactly what you've done and how it's been carried out, and we can find errors, which makes the results more believable. But researchers don't share their code, and the number one reason they don't is because the underlying data is not available. And that's appropriate; I'm not advocating that people release their private medical data online, so don't worry, we can't release the data. But we do have a situation now where we've got researchers writing code that's quite important, it's not really being checked, they're working in these secure environments, and there are lots of problems, one of which is that the code doesn't typically get reused again.

So by releasing a fake or dummy data set alongside the statistical code and the paper, I think this situation could be resolved. So if we think about what we want from our dummy data, there are a few complicated aspects to this. First, we want similar codes: I could code your pneumonia as 'pneumonia - finding', but somebody else, perhaps my colleague next door, will use a different code, maybe 'infective pneumonia', which is a child code of pneumonia. Somebody else might code it as 'cough requiring antibiotics'. So the dummy data needs to have this level of complexity, because researchers need to write statistical code which will capture all of the different ways you could denote that this person has pneumonia. The second thing is conditions.

Conditions are related to each other, so simple  example, if I was to take your blood pressure,   your systolic blood pressure, that's the top  number on the blood pressure reading, you probably   have had a diastolic blood pressure, the bottom  number, done at the same time. So those two things   go together. And also conditions have shared  risk factors, so we know that if you have had   a stroke in the past you're much more likely to  have a cardiovascular event like a heart attack,   so those things coexist together more commonly.  So why Surreal? So Surreal is a really good option  

for this, and Tobie's talked a little bit about why that is, but the one thing that I'm really, really interested in is how we can traverse a graph structure. We want to model this complex relationship between different codes or nodes in a graph, and this is something we can do in Surreal. We also want to run code snippets as part of a query, so that as each node gets hit while we're traversing the graph, we can send off an async thread which will generate the data records and eventually come back to produce the dummy data. RELATE statements are super useful for finding similar codes, and pragmatically it's a single back end; I hope one day we could have a nice GUI on the front of it, available in a browser or maybe even a desktop app. And with that, I'm going to hand back over to Tobie. [Applause] [Music]

Thank you, Caroline. SurrealDB Version 1. We think that SurrealDB can bring many benefits to applications of all sizes, regardless of how they run, whether embedded on devices or running as a traditional database

platform. SurrealDB Version 1, released today, marks the beginning of SurrealDB's journey towards a stable database platform suitable for integration within large tech platforms. To enable this, we have been working on three core functionalities that are integral to SurrealDB. The first of these features is Change Feeds. Here is Senior Software Engineer Yusuke to explain more.

In order to integrate SurrealDB within the wider technology ecosystem, with SurrealDB Version 1 we are introducing Change Feeds. This fundamental feature provides change-data-capture functionality to SurrealDB, enabling users and developers to track and respond to changes as they occur within the database. Whether exporting data in real time into third-party systems, moving data to object storage for backup or analysis purposes, or even for real-time cross-cloud synchronisation with other platforms, Change Feeds enable greater interoperability with other technologies within larger enterprise systems. In order to implement this core functionality, whilst at the same time

ensuring that it worked consistently, regardless of database deployment setup or environment, we needed to ensure that the logic itself was separated from the storage layer within the database. Change feed functionality in SurrealDB sits within the ACID transaction layer of the database, responding to any changes which occur, from schema and index changes to records and changes in the graph. Accessible to any database user with the correct permissions level, our initial implementation of Change Feeds can be applied to individual tables separately, or to all tables within a database as a whole. What this means to a developer is that applications can subscribe to the specific data that they need, without impacting the performance of the database system or cluster. Change Feeds are beneficial both as an externally facing feature, enabling users to retrieve data as it changes, and also as an internal feature. Looking forward to the future, SurrealDB will use Change Feeds as an underpinning of a number of long-running tasks,

including a non-blocking indexing system. This will enable SurrealDB to support improved background indexing for traditional full-text search and vector embedding indexes, allowing for zero-downtime asynchronous generation or reconstruction of large data sets, without any need for table or database locks. Head to SurrealDB.com/cf for more information. [Applause]

Externally, Change Feeds will enable SurrealDB to play a role within the wider ecosystem of enterprise, cloud or microservice-based platforms, giving users the ability to retrieve and sync changes from SurrealDB to external systems and platforms. Internally, in the future, Change Feeds will enable SurrealDB to handle long-running tasks, including the rebuilding of unique, full-text search and vector indexes asynchronously, without any downtime or blocking. I'll let Experiences Manager Lizzie go into a bit more detail about how this feature will be beneficial for developers going forward.

Change Feeds in SurrealDB are a foundational feature which forms the underpinnings of change data capture. Integral both for internal use within SurrealDB and for user-facing benefits, Change Feeds enable a multitude of use cases for any application, from small projects to integration within enterprise platforms. Change data capture is the process of tracking changes in a database, in order to synchronise those changes with destination systems; this enables data integrity, data backup and consistency across systems and environments. From ingesting data into third-party systems, archiving data to object storage for backup or analysis purposes, or for real-time synchronisation with other platforms, Change Feeds are a core feature for the enterprise. In SurrealDB, Change Feeds can be enabled on specific individual tables or applied to all tables within a database with just a single SurrealQL command. Under the hood, SurrealDB tracks all of the changes made to table data by any user, whether running as an embedded or single-node instance, or running in a distributed cluster with multiple SurrealDB nodes.
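As a rough sketch of what this looks like in practice (the table name and retention window are illustrative), a change feed is switched on when a table or database is defined, and read back later with the SHOW CHANGES command described below:

```
-- Enable a change feed on a single table, retaining changes for three days
DEFINE TABLE reading CHANGEFEED 3d;

-- A change feed can also be enabled for the database as a whole
DEFINE DATABASE telemetry CHANGEFEED 3d;

-- Later, stream the recorded changes back from a chosen point in time
-- (SINCE also accepts a versionstamp for exact historic retrieval)
SHOW CHANGES FOR TABLE reading SINCE "2023-09-07T01:23:52Z" LIMIT 10;
```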

As a database administrator, you are then able to retrieve change data using the new SHOW CHANGES command, allowing you to retrieve changes since a particular versionstamp, or by specifying a date and time after which the changes should be streamed. By introducing Change Feeds within the core engine of SurrealDB, we are enabling data consistency and synchronisation with your external platforms, regardless of the database setup or operating environment. In turn, this feature enables you to use SurrealDB as a central component of any enterprise, cloud or microservice-based platform. Head to SurrealDB.com/cf for more information.

By introducing a new statement type, users can now retrieve changes since a specific version or timestamp for an entire database, or for a specific table. Versionstamps enable exact historic retrieval, whilst timestamps enable a more developer-friendly way of retrieving changes.

Our second major feature in SurrealDB Version 1 is Live Queries.

Here is Senior Software Engineer Hugh to explain more. In order to enable modern, collaborative and responsive applications to be built on top of SurrealDB, we decided to go one step further. Live Queries, although similar to Change Feeds, open up a whole new type of application to be built on top of SurrealDB. Whilst Change Feeds give a historic view over time of the changes to a database or specific database tables, with the ability to listen to changes since a specific point in time, Live Queries give developers the ability to receive real-time change notifications for data as changes happen, but without any ability to subscribe to historic changes. The big difference, however, is that while Change Feeds can be accessed by database administrators, Live Queries are integrated directly within the table, row and field level permissions of SurrealDB. What this means to a developer is that each Live

Query notification is unique and tailored to the authentication of the user who issued the query. In other real-time databases, streaming functionality can be achieved by subscribing to the database changes, but authentication and permissions logic then needs to be built into a custom API layer sitting in front of the database. With Live Queries, users can build

applications that respond to specific document changes, full table updates or aggregate table views with just a single SELECT query, using field projections or SurrealQL functions if desired. In order to implement this functionality so that it works both when running as an embedded database and in a highly scalable distributed cluster, we needed to build it as a layer above the data storage engine, using a combination of in-memory and persistent storage based techniques. This ensures that any change initiated from any SurrealDB node in a cluster can be sent to the relevant node processing the live query, with ordered, at-most-once delivery characteristics. The applications where Live Queries can be of benefit are numerous: whether for live updating user interfaces, real-time game notifications, dashboard visualisations, collaborative diff-patch-match based editing, live updating activity feeds, live chat, or even responsive geofencing detection, Live Queries offer a much needed feature with effortless integration, and when pairing this with predefined aggregate views the functionality becomes even more powerful. We can't wait to see what people will build using this functionality. [Applause]

As Hugh said, we really can't wait to see what people will build using Live Queries.
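To show how little is involved, here is a minimal sketch of starting and stopping a live query; the table name, filter and UUID are illustrative only.

```
-- Subscribe to every change on a table; the statement returns a live query UUID
LIVE SELECT * FROM ticket;

-- A WHERE clause narrows the subscription so only matching notifications arrive
LIVE SELECT * FROM ticket WHERE status = 'open';

-- Stop receiving notifications for a previously started live query
KILL "0189d8f0-8eac-703a-9a48-d9faa78b44b9";
```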

What was previously as complex as synchronising changes between multiple different databases, dealing with the permissions, authentication and business logic in a custom API layer, and then handling the real-time communication with external users, is now possible by connecting directly to SurrealDB and issuing a single query. And this is because of the permissions and authentication layer built directly into the database. Here is Developer Experience Engineer Obinna to talk about the benefits that this will bring to developers and users. Live Queries in SurrealDB enable a simple yet seamless way of building modern responsive applications, whether connecting to SurrealDB as a traditional backend database or connecting directly to the database from the front end. With just a single query you can now subscribe to changes as they happen in the database, either for a whole table, or by filtering the real-time notifications so that only the desired change data is delivered. With just a single word addition to the traditional SELECT query, a Live Query enables you to select all fields from a document or projected fields. In addition, by using a native JSON-diff-patch

implementation, even the exact, concise document changes can be received whenever a document is modified. Live Queries are built right into the core of the database and benefit from all the functionality that you can use elsewhere on the SurrealDB platform. Take for example the need to modify each change notification as it is delivered to your users. Here, custom functions can be used within the field projections to alter the data before it is sent to the client.
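For example, here is a hedged sketch of the two variants described here; fn::mask is a hypothetical custom function that would need to be defined separately.

```
-- Receive JSON-diff-patch style changes instead of full documents on every update
LIVE SELECT DIFF FROM document;

-- Project specific fields, shaping each notification before it is delivered;
-- fn::mask is a hypothetical custom function used to alter the outgoing data
LIVE SELECT id, fn::mask(email) AS email FROM user;
```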

Most importantly, however, Live Queries in SurrealDB are fully backed by the powerful authentication and permissions layer, meaning that regardless of what a user has subscribed to, notifications will only be delivered based on the authenticated session of that user. This all happens seamlessly within SurrealDB in the same way it does for normal SurrealQL statements. By bringing the simplicity of Live Queries alongside the advanced nature of predefined aggregate views, you can now build powerful dashboards that rely on aggregate data queries, computationally expensive analytics queries and filtered collections of massive data sets, all updating in real time as your data and your database changes. Head to SurrealDB.com/iq for more information. [Applause]

Live Queries are such a powerful feature, allowing you to take a simple SELECT statement and turn it into a subscription-based query with change notifications. But the real power comes when you combine predefined aggregate views with Live Query functionality, allowing you to subscribe to aggregated data as it changes over time, with support for custom grouping, rolling averages and grouped minima and maxima. This is perfect for live updating dashboards, charts and visual displays. Our third major feature in this release is Indexing. Here is Senior Software Engineer Emmanuel to explain more. An integral part of any database system is the secondary indexing used to optimise and improve the performance of database queries and data analysis. With SurrealDB, when it came to

implementing indexes within the database, we wanted to ensure that, whatever approach we took, it would enable us to offer the same functionality whether running as an embedded database, a single-node database with vertical scaling, or a horizontally scalable distributed database cluster. In SurrealDB Version 1, we are really excited to now have support for traditional indexing, unique indexes and constraints, full-text search indexes and vector embedding indexing. The quickest and simplest approach with indexing would perhaps have been to rely on any of the popular indexing libraries or third-party platforms, but this would have limited the functionality and the applications where the indexing could have been used. Instead, we reimagined how indexing might be implemented, opting for a completely custom-built indexing engine which sits within the SurrealDB core itself. The engine is agnostic

to its deployment environment, whether running on top of IndexedDB in the browser, an embedded runtime in Rust or Python, or distributed over multiple nodes in a highly scalable cluster. With this approach, instead of passing the document indexing, query parsing and data structure storage to an external library or platform, SurrealDB handles all of this logic itself, directly within the ACID transaction model of the database. What this means for a developer using SurrealDB is that the indexing engine is able to integrate and interoperate with the SurrealQL query language natively, without the need for an additional external query language or for indexing-specific functions or plugins. Looking at the indexing functionality itself,

we are really excited about what can already be achieved. For traditional and unique indexes, SurrealDB already supports simple single-field indexes, multi-field compound indexes, nested object and array fields, and also has support for flattened indexing of array data. With full-text search, SurrealDB allows developers to define custom analyzers which specify exactly how their text data should be processed, with support for multiple tokenizers and advanced filters, including Ngram, EdgeNgram and Snowball, and support for 17 languages from English to Arabic. With vector

embedding indexing, our initial implementation supports exact nearest-neighbour retrieval for vectors of arbitrary size using Metric Trees, with support for HNSW-based approximate nearest-neighbour retrieval coming in the future. Along with the indexing, SurrealDB Version 1 now has support for explaining the complexity of any query which selects data from the database, allowing developers to understand the performance implications and index usage of their SurrealQL queries, and also gives users the ability to force the database to use a specific index. As the index data is stored directly within the storage engine and not within the query nodes themselves, this opens up the possibilities of how data can be indexed and queried at scale with SurrealDB.
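To make this concrete, here is an illustrative sketch; the table, field and index names are hypothetical, and the MTREE clause reflects the beta Metric-Tree vector indexing described above.

```
-- A standard secondary index on a single field
DEFINE INDEX idx_title ON TABLE document FIELDS title;

-- A vector index using Metric Trees (MTREE) for exact nearest-neighbour retrieval;
-- the dimension must match the length of the stored embedding vectors
DEFINE INDEX idx_embedding ON TABLE document FIELDS embedding MTREE DIMENSION 768;

-- EXPLAIN reports how the query will be executed and which index it will use
SELECT * FROM document WHERE title = 'SurrealDB World' EXPLAIN;
```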

And with the indexing functionality implemented natively within SurrealQL, we are really excited to see the uses and applications which can benefit from this technology. [Applause]

Indexing is such an important piece of any database platform, and we're really excited about this initial implementation. Vector embedding indexing is now available in SurrealDB Version 1 as a beta feature, and we'll be working on the performance aspects of the indexing engine over the coming months. Here is Developer Advocate Pratim to go into how this can be used by developers and how it will affect applications built on top of SurrealDB. SurrealDB is designed for building applications of any size, whether an indie project or an enterprise platform; for that, query performance and improved data analysis workloads are key.

With SurrealDB secondary indexes, you can now index data using traditional indexes, full-text search indexing and vector embedding search for artificial intelligence use cases. All of these index types are native to the database, meaning that they interoperate with the SurrealQL query language and work the same way whether running on top of IndexedDB in the browser, an embedded runtime in Rust or Python, or distributed over multiple nodes in a highly scalable cluster. For you, that means that defining and implementing these indexes can be as simple as running a single query. Take for instance a multi-field compound index with nested array data: using a single index definition statement, we can easily implement this index, specifying whether arrays of values should be flattened into separate index entries or not. For full-text search indexes, more options exist for configuring the indexing behaviour. Custom analyzers allow you to specify which tokenization methods are used to split text at boundaries, along with a range of filtering algorithms, including Ascii, Lowercase, Uppercase, NGram, EdgeNGram and Snowball.

They allow for advanced processing and stemming of all types of text in a large number of languages. When retrieving indexed results, sophisticated methods for matching and term highlighting allow for effortless integration with front-end interfaces. For indexing and searching AI-based vector embeddings, SurrealDB now includes native support for exact nearest-neighbour retrieval using Metric Trees. Similarly to the other index types, these indexes are simple to set up and native to the database core; whether using SurrealDB as an embedded database, a single-node server or a scalable database cluster, the indexing functionality is designed to work seamlessly, giving you the power and performance that you can expect from a database. Head to SurrealDB/ix for more information. [Applause]

Built directly into the SurrealQL query language, SurrealDB supports many different index types with a whole range of configuration options. Unique indexes allow for data constraints on single or multiple fields in a record, while traditional indexes have support for multiple fields, compound indexes on array values or nested object values within arrays, with the ability to combine multiple fields together.
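As an illustrative sketch of the index types just described (the table, field, analyzer and index names are hypothetical):

```
-- A unique constraint on a single field
DEFINE INDEX idx_email ON TABLE user FIELDS email UNIQUE;

-- A compound index across several fields, including a nested object field
DEFINE INDEX idx_name_city ON TABLE user FIELDS name, address.city;

-- A custom analyzer and a full-text search index built with it
DEFINE ANALYZER simple TOKENIZERS class FILTERS lowercase, ascii;
DEFINE INDEX idx_bio ON TABLE user FIELDS bio SEARCH ANALYZER simple BM25 HIGHLIGHTS;

-- Full-text matching with highlighted results; the match reference 1 ties the
-- @1@ predicate to the search::highlight call
SELECT id, search::highlight('<b>', '</b>', 1) AS bio
FROM user WHERE bio @1@ 'database engineer';
```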

Full-text search indexes are simple to define and once again sit natively within the SurrealQL query language, enabling users to efficiently index, search and retrieve results using relevance and scoring functionality. Vector embedding indexes mean that when working with artificial intelligence data and large language model data, the index information can reside right next to the data itself. All of this indexing functionality works the same way whether running as an embedded database or as a distributed cluster.

Before I leave you to enjoy the rest of the day, there is one more feature we are introducing today.

SurrealML is our first step towards bringing  Machine Learning to the Surreal ecosystem.   Here is Senior Software Engineer Maxwell to  explain more. One feature we're extremely excited   to introduce in SurrealDB Version 1 is SurrealML.  Instead of just a feature within the database,  

SurrealML is a whole suite of tools which is  just the beginning of bringing machine learning   and artificial intelligence workflows, inference  and reasoning into the database itself. With this   release we are introducing a new SurrealML file  type for working with PyTorch and SKLearn models   in Python. This file format powered by our Rust  runtime allows machine learning model developers   to train in Python and save the model and metadata  to a portable and open source file format,   allowing for seamless model versioning and  execution across different Python versions,   environments and platforms. Although powerful  in its own right, the real benefit comes from  

the ability to bring these pre-trained models into SurrealDB, enabling model inference within the Rust-based SurrealDB runtime. With embedded metadata and data normalisation logic stored within the SurrealML file, surrounding the pre-trained internal model, the database runtime understands what arguments and values the model expects, allowing any data within the database to be inferred against the supplied model. Whether running as an embedded database instance, as a single-node database server or as a distributed database cluster, the machine learning engine in SurrealDB scales effortlessly to meet the demands of today's applications. With SurrealML, we are taking the flexibility and ecosystem of machine learning in Python and bringing it alongside the power and performance of Rust and SurrealDB. Whether working with raw data inputs or simplified model arguments, SurrealDB extends the power of Python machine learning without changing the traditional approach to implementing machine learning workflows in Python. This is just the beginning of our journey to bring machine learning and artificial intelligence to SurrealDB, and we are eager to see how SurrealML and the accompanying tools, alongside the machine learning engine in SurrealDB, power and enable applications within any industry, from indie projects to startup products to mission-critical enterprise applications operating at scale.

[Applause] With the introduction of our own file format, with metadata and versioning included within the file, working with machine learning models can be greatly simplified, ensuring reproducibility and consistency in machine learning pipelines. Today, SurrealML can be used in beta within Python, with any PyTorch and most SKLearn models. In the future, we will be integrating SurrealML with the Hugging Face ecosystem, so that large language models can be used and transported into SurrealDB for inference and reasoning directly on your data. Here is Software Engineer Misha to dive into how SurrealML can be used with SurrealDB.

With SurrealML, you can now use a portable and open-source file format to package and embed PyTorch and SKLearn machine learning models, whilst at the same time storing model metadata and normalisation logic alongside the pre-trained models. This allows for effortless versioning of different

machine learning models and interoperability across different Python software versions, environments and platforms. Using our Rust-based SurrealML runtime, the file header stores the model name, description, version and data normalisation logic. This means that the processing logic that is required for all data as it is passed into a model for inference purposes no longer has to sit separately in the Python deployment. Instead, as this business logic now sits alongside the model itself, the stability and reproducibility of each model is guaranteed.

In addition to the benefits that the SurrealML file format brings to Python, the Rust-powered machine learning engine in SurrealDB now supports ingesting a fully packaged SurrealML file, enabling performant inference within SurrealDB, whether at scale or embedded on any device. To import a pre-trained Python model into SurrealDB, a single command can be used to simply add this model to your database, scaling across database cluster nodes if desired. SurrealDB automatically reads and understands the model requirements, immediately setting up a custom inbuilt function which can be used to infer results from the model itself.
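As an illustrative sketch only: the generated ml:: function described here might be called from SurrealQL roughly as follows, where the model name, version, input fields and table are entirely hypothetical placeholders.

```
-- Run inference against an imported SurrealML model for each record;
-- 'house-price-prediction', its version and the input fields are placeholders
SELECT *,
    ml::house-price-prediction<0.0.1>({
        squarefoot: area,
        num_floors: floors
    }) AS predicted_price
FROM listing;
```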

Inference works with either raw data inputs for advanced usage, or with field-name key bindings packaged into the SurrealML file format itself. This means that seamless integration with SurrealQL object types allows you to work more quickly and consistently with models as they are updated. Head over to SurrealDB.com/ML for more information. [Applause]

Using an HTTP route and combining this with SurrealQL, we can now import our models directly into SurrealDB effortlessly. SurrealDB automatically reads and understands the model requirements, immediately setting up a custom inbuilt function which can be used to infer results from the model itself. Inference works with either raw data inputs for advanced usage or

with the field name key bindings packaged into the  SurrealML file format. Developers no longer have   to use external platforms or systems to run model  predictions against data residing in the database.   Instead the model logic can sit directly within  the SurrealQL query language, extending the power   of custom functions. As with Live Queries the real  power comes when we combine this functionality  

with the other powerful features within SurrealQL. Model inference can now be used seamlessly within events, Live Queries and other custom functions, whether as a traditional database backend or connected to directly from the browser. SurrealML within SurrealDB will be released in beta this month, so that you can start working with machine learning models right within the database with ease.

Thank you to all the team members whose hard work has been instrumental in the launch of SurrealDB Version 1, and thank you also to the incredible Experiences Team that has made SurrealDB World possible today. This is just the first step in SurrealDB's journey to simplify the lives

of developers and enable applications to interact with and build upon data in a newly imagined way. Each of the features announced today, from Live Queries to indexing to the start of bringing intelligence to SurrealDB, brings its own power yet simplicity to applications, but it is the ability to combine all of this formidable functionality together within the Surreal ecosystem that will really lead to unbounded possibilities. Thank you.
