Deliver search-friendly JavaScript-powered websites (Google I/O '18)


Good morning, everyone. My name is Tom Greenaway, and I'm a Partner Developer Advocate from Google Sydney, with a focus on the indexability of progressive web applications. Hi everyone, I'm John Mueller, a Webmaster Trends Analyst from Zurich, in Switzerland. It's great to see so many of you here, even at this early hour.

Now, as you can imagine, John and I have a lot of experience with the work web developers must do to ensure their websites are indexable, which is another way of saying whether a web page can be found and understood by search engines. But do search engines see all web pages exactly the same way? Are some pages more complex than others? And what about modern JavaScript-powered websites? Today we'll be taking a closer look into what it takes for a modern JavaScript-powered website to be properly indexed by search crawlers, and especially Google Search. And I'm excited to tell you that in this talk we're announcing a bunch of cool new stuff, including a new change to Google Search policy, a new approach for rendering HTML to search crawlers, and even a new Google Search Console tool. It sounds like a lot of stuff, right? Well, that's because it is. So let's get started.

A long time ago, before I joined Google, I was building e-commerce sites, and I personally felt there was a lot of mystery at times behind Google Search, especially on the topic of indexability. I would wonder: why do some of my pages appear in Google Search and some don't, and what's the difference between them? Will JavaScript be rendered correctly? Will JavaScript-rendered content appear properly and be indexed? And is lazy-loading an image safe to do? These are really critical questions, and as developers ourselves, we understand the frustration behind this mystery.
So today, John and I are going to do something we very rarely do at Google: we're going to pull back the curtain a little bit and reveal some new pieces of information about how Google Search sees the web and indexes it. And with this new knowledge and a few new tools, you'll have concrete steps you can take to ensure the JavaScript-powered websites you're building are visible to Google Search.

Now, I also want to remind you that this talk is about modern JavaScript-powered websites, and typically these websites will be powered by a JavaScript framework such as Angular, Polymer, React, or Vue.js. And who doesn't love a great web development framework that's easy to use, helps you build your sites faster, and works great for your users? But it's important to recognize that some of these frameworks use a single-page app configuration model, meaning they use a single HTML file that pulls in a bunch of JavaScript. That can make a lot of stuff simpler, but if you don't watch out, JavaScript-powered websites can be a problem for search engines.

So let's take a quick look at what the default template for a new Angular project looks like. As you can see, the default project template is pretty basic: it shows you how to use Angular to render a header, an image, and a few links. Nice and simple. How could this possibly be a problem from an indexability perspective? Well, let's take a peek behind the scenes at the HTML. This is it. Take a good look. When viewed in the browser, the default sample project had text, imagery, and links, but you wouldn't know that from looking at this initial HTML that's been delivered from the server, now would you? The initial HTML that's been sent down is actually completely devoid of any content. See here, in the app root: that's all there is in the body of the page, except for some script tags. So some search engines might assume that there's actually nothing here to index.

And to be clear, Angular isn't the only web framework that serves an empty response on its initial server-side render; Polymer, React, and Vue.js have similar issues by default. So what does this mean for the indexability of our websites from the perspective of Google Search? Well, to answer that question better, we'll take a little step back and talk about the web in general: why search engines exist, and why search crawlers are necessary. Perhaps a good question to start with is: how big is the web? Well, we can tell you that we've actually found over 130 trillion documents on the web. In other words, it's really big. And as you know, the aim of all search engines, including Google, is to provide a list of relevant search results based on the user's search query. And to make that mapping of user queries to search results fast and accurate, we need an index, similar to the catalog of a gigantic library. And given the size of the web, that's a really complex task.
So, to build this index to power our search engine, we need another tool: a search crawler. Traditionally, a search crawler was basically just a computer and a piece of software that performed two key steps. One: it aims to find a piece of content to be crawled, and to do this, the content must be retrievable via a URL. And two: once we have a URL, we get its content, and we sift through the HTML to index the page and to find new links to crawl as well. And thus the cycle repeats.

So let's look at that first step, the crawling, and break it down. Oh, and yes, as an Australian I thought it was imperative that I include some spiders in my talk, so this is the cutest possible one I could find. John, what do you think? No, you're not convinced? Okay, well, I have a few more in the deck, so maybe we'll come around.

So, to ensure the crawling is possible, there are some key things to keep in mind. Firstly, we need URLs to be reachable, as in, there shouldn't be any issues when the crawler wants to request the web pages and retrieve the resources necessary for indexing them from your web server. Secondly, if there are multiple documents that contain the same content, we need a way to identify the original source; otherwise it could be interpreted as duplicate content. And finally, we also want our web pages to have clean, unique URLs. Originally this was pretty straightforward on the web, but then the first single-page apps made things more complicated. So let's go through each of these concepts.

First, for the reachability of URLs, there's a simple, standard way to help search engines find content that you're probably familiar with: you add a plain text file called robots.txt to the top-level domain of your site, which specifies which URLs to crawl and which to ignore. And I say URLs, because these rules can prevent JavaScript from being crawled too, which could affect your indexability. This example also gives us a link to a sitemap. A sitemap helps crawlers by providing a recommended set of URLs to crawl initially. And, to be clear, there's no guarantee these URLs will get crawled; they're just one of the signals that search crawlers will consider.
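As a sketch, a robots.txt like the one described might look like this (the domain and paths here are made up for illustration):

```text
# robots.txt served from https://example.com/robots.txt
User-agent: *
# Keep a private section out of the crawl.
Disallow: /admin/
# Don't block the JavaScript your pages need in order to render!
Allow: /assets/js/

# A recommended starting set of URLs for crawlers.
Sitemap: https://example.com/sitemap.xml
```

Remember that disallowing a script directory here would also prevent crawlers from rendering any page that depends on those scripts.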

Okay, but now let's talk about that duplicate content scenario, and how search crawlers deal with this situation. Sometimes websites want multiple pages to have the same content, right? Even if it's a different website. For example, some bloggers will publish articles on their website and cross-post to services like Medium to increase the reach of that content, and this is called content syndication. But it's important for search crawlers to understand which URL you prefer to have indexed. So the canonical metadata syntax, shown here in the HTML, allows the duplicate documents to communicate to crawlers where the original, authoritative source for the content lives. We call that source document the canonical document.

And traditionally, URLs for the web started out quite simple: just a URL that pointed at a server with some HTML. But then, of course, Ajax came along, and it changed everything. Suddenly websites could execute JavaScript which could fetch new content from the server without reloading the browser page. But developers still wanted a way to support back and forward browser navigation and history as well. So a trick was invented which leveraged something called the fragment identifier, whose purpose is deep linking into the sub-content of a page, like a subsection of an encyclopedia article. And because fragment identifiers were supported by browsers for history and navigation, this meant developers could trick the browser into fetching new content dynamically without reloading the browser page, and yet also support the history and navigation we loved about the web. But we realized that using the fragment identifier for two purposes, subsections on pages and also deep linking into content, wasn't very elegant, so we moved away from that. Instead, another approach was proposed: to use the fragment identifier followed by an exclamation mark, which we called the hashbang. This way we could discern the difference between a traditional URL using the fragment identifier for sub-content on the page, versus a fragment identifier being used by JavaScript to deep link into a page. And this technique was recommended for a while.

However, nowadays there is a modern JavaScript API that makes these old techniques less necessary, and it's called the History API. And it's great, because it enables managing the history state of the URL without requiring complete reloads of the browser, all through JavaScript. So we get the best of both worlds: dynamically fetched content with clean, traditional URLs. And I can tell you that from Google's perspective, we no longer index that single-hash workaround, and we discourage the use of the hashbang trick as well.

Okay, well, that's crawling out of the way. Now let's move on to the indexing step. Web crawlers ideally want to be able to find all the content on your website; if the crawlers can't see some content, then how are they going to index it? And the core content of the page includes all the text, imagery, video, and even hidden elements like structured metadata. In other words, it's the HTML of the page. But don't forget about the content you dynamically fetched, either; this could be worth indexing as well, such as, you know, Facebook or Disqus comments. Crawlers want to see this embedded content too.

And also, this might seem really obvious, but I want to emphasize that at Google we take HTTP codes pretty seriously, especially 404 "not found" codes. If crawlers find a page that has a 404 status code, then they probably won't even bother indexing it. And lastly, of course, a crawler wants to find all the links on a page as well, because these links allow the crawlers to crawl further. So now let's just talk a bit about those links quickly, because honestly they're some of the most important parts of the web.

How do search crawlers like Google find links? Well, I can't speak for all search crawlers, but I can say that at Google we only analyze one thing: anchor tags with href attributes, and that's it. For example, this span here that I've just added won't get crawled, because it's not an anchor. And this additional span I've added, even though it's an anchor, doesn't have an href attribute. But if you are using JavaScript, such as with the History API that I mentioned earlier, to navigate the page purely on the client and fetch new content dynamically, you can still do that, so long as you use anchor tags with href attributes, like in this last example. Because most search crawlers, including Google, will not simulate clicking around a page to find links; only the anchor tags will be followed for linking.

Okay, is that really everything? In order to have sifted through the HTML to index the page, we needed to have the HTML in the first place, and in the early days of the web the initial response likely gave us all the HTML that was necessary. But nowadays that's not really the case. So let's insert a step between crawling and indexing, because we need to recognize that the search crawlers themselves might need to take on this rendering task as well. Otherwise, how will the search crawler understand the modern JavaScript-powered websites we're building? Because these sites are rendering their HTML in the browser itself, using JavaScript and templating frameworks, just like that Angular sample I showed you earlier.

So when I say rendering, I don't mean drawing pixels to the screen; I'm talking about the actual construction of the HTML itself. And ultimately this can only ever happen on either the server or on the client, or a combination of the two can be used, and we call that hybrid rendering. Now, if it's already pre-rendered on the server, then a search engine could just index that HTML immediately. But if it's rendered on the client, then things get a little bit trickier, right? And so that's going to be the challenge that we'll be discussing today.
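To make that distinction concrete, here's a minimal sketch of what a crawler receives in each case. The page data, function names, and markup are invented for illustration; they aren't from the talk or any framework:

```javascript
// Hypothetical page data for illustration.
const page = { title: 'Welcome', body: 'Content the crawler should see.' };

// Client-side rendering: the initial response is an empty shell.
// The real content only exists after the browser runs app.js.
function clientSideShell() {
  return '<html><body><app-root></app-root>' +
         '<script src="app.js"></script></body></html>';
}

// Server-side (or pre-) rendering: the content is already in the
// HTML, so even the first wave of indexing can see it.
function serverSideRender(p) {
  return `<html><body><app-root><h1>${p.title}</h1>` +
         `<p>${p.body}</p></app-root></body></html>`;
}

console.log(clientSideShell().includes(page.title));      // false
console.log(serverSideRender(page).includes(page.title)); // true
```

A crawler that indexes only the first response sees the title and body in the second case, but nothing at all in the first.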

But one last term: you might be wondering, what is Google Search's crawler called? Well, we call it Googlebot, and we'll be referring to it a lot in this talk. And I think another detail to note is that I said a search crawler is basically just a computer with some software running on it. Well, maybe in the '90s that was the case, but nowadays, due to just the sheer size of the web, Googlebot is comprised of thousands of machines running all this distributed software that's constantly crunching data to understand all of this continuously expanding information on the web. And to be honest, I think we sometimes take for granted just how incredible Google Search really is. For example, I recently learned that the Knowledge Graph, which is a database of all the information we have on the web, actually maps out how more than 1 billion things in the real world are connected, with over 70 billion facts between them. It's kind of amazing.

Okay, well, now that we know the principles of a search crawler, let's see how these three different key steps, crawling, rendering, and indexing, all connect. Because one crucial thing to understand is the cycle of how Googlebot works, or how it should. As you can see, we want these three steps to hand over to one another instantly, and as soon as the content is fully rendered, we want to index it, to keep the Google Search index as fresh as possible. This cycle sounds simple, right? Well, it would be, if all the content was rendered on the server and complete when we crawl it. But as you know, if a site uses client-side rendering, then that's not going to be the case, just like that Angular sample I showed you earlier. So what does Googlebot do in this situation? Well, Googlebot includes its own renderer, which is able to run when it encounters pages with JavaScript. But rendering pages at the scale of the web requires a lot of time and computational resources, and make no mistake, this is a serious challenge for search crawlers, Googlebot included.

And so we come to the important truth about Google Search we would like to share with you today, which is that currently, the rendering of JavaScript-powered websites in Google Search is actually deferred until Googlebot has the resources available to process that content. Now, you might be thinking, okay, well, what does that really mean? Well, I'll show you. In reality, Googlebot's process looks a bit different: we crawl a page, we fetch the server-side rendered content, and then we run some initial indexing on that document. But rendering the JavaScript-powered web pages takes processing power and memory, and while Googlebot is very, very powerful, it doesn't have infinite resources. So if the page has JavaScript in it, the rendering is actually deferred until we have the resources ready to render the client-side content, and then we index the content further. So Googlebot might index a page before rendering is complete, and the final render can actually arrive several days later. And when that final render does arrive, then we perform another wave of indexing on that client-side rendered content. And this effectively means that if your site is using a heavy amount of client-side JavaScript for rendering, you could be tripped up at times when your content is being indexed, due to the nature of this two-phase indexing process.

And so ultimately, what I'm really trying to say is that because Googlebot actually runs two waves of indexing across web content, it's possible some details might be missed. For example, if your site is a progressive web application and you've built it around the single-page app model, then it's likely all your unique URLs share some base template of resources which are then filled in with content via Ajax or fetch requests. And if that's the case, consider this:
Did the initially server-side rendered version of the page have the correct canonical URL included in it? Because if you're relying on that to be rendered by the client, then we'll actually completely miss it, because that second wave of indexing doesn't check for the canonical tag at all. Additionally, if the user requested a URL that doesn't exist, and you attempt to use JavaScript to send the user a 404 page, then we're actually going to miss that too.

Now, John and I will talk more about these issues later in the talk, but the important thing to take away right now is that these really aren't minor issues; these are real issues that could affect your indexability. You know, metadata, canonical tags, HTTP codes: as I mentioned at the beginning of this talk, these are all really key to how search crawlers understand the content on your web pages.

However, just to be clear, not all web pages on a website necessarily need to be indexed. For example, on the Google I/O website there is a listing and filter interface for the sessions, and we wanted search crawlers to find the individual session pages. But we discovered the client-side rendered deep links weren't being indexed, because the canonical tags were rendered in the client and the URLs were fragment-identifier based. So we implemented a new template with clean URLs and server-side rendered canonical tags, to ensure the session descriptions were properly indexed, because we care about that content. And to ensure these documents were crawlable, we added them to the sitemap as well. But what about the single-page app which allows for filtering sessions? Well, that's more of a tool than a piece of content, right? Therefore it's not as important to index the HTML on that page. So ask yourself this: do the pages I care about, from the perspective of content and indexing, use client-side rendering?

Okay, so now you know: when building a client-side rendered website, you must tread carefully. As the web and the industry have grown bigger, the teams and companies have become more complex too. We now work in a world where the people building websites aren't necessarily the same people promoting or marketing those websites. And so this challenge is one that we're all facing together as an industry, both from Google's perspective and yours as developers, because after all, you want your content indexed by search engines, and so do we.

Well, this seems like a good opportunity to change tracks. So, John, do you want to take over and tell everyone about the Google Search policy changes and some of the best practices they can apply, so we can meet this challenge together?

Sure. Thanks, Tom, that was a great summary of how search works, though I still don't know about those pictures of spiders. That is scary. But Googlebot in reality is actually quite friendly. Anyway, as Tom mentioned, the indexing of modern JavaScript-powered websites is a challenge. It's a challenge both for Google as a search engine and for you all as developers of the modern web. And while developments on our side are still ongoing, we'd like to help you tackle this challenge in a more systematic way. So for that, we'll look at three things here: the policy change that we mentioned briefly before, some new tools that are available to help you diagnose these issues a little bit better, and lastly, a bunch of best practices to help you make better JavaScript-powered websites that also work well in search.

So, we've talked about client-side rendering and server-side rendering briefly already. Client-side rendering is the traditional state where JavaScript is processed on the client; that would be the user's browser, or on a search engine.
For server-side rendering, the server, so your server, will process the JavaScript and serve mostly static HTML to search engines. Often this also has speed advantages; especially on lower-end mobile devices, JavaScript can take a bit of time to run. This is a good practice. For both of these, we index the state as ultimately seen in the browser. So that's what we pick up, and we try to render pages when we need to do that.

There's a third type of rendering that we've talked about in the past. It starts in the same way, in that pre-rendered HTML is sent to the client, so you have the same speed advantages there. However, on interaction, or after the initial page load, JavaScript is added on top of that. And as with server-side rendering, our job as a search engine is pretty easy here: we just pick up the pre-rendered HTML content. We call this hybrid rendering. This is actually our long-term recommendation; we think this is probably where things will end up in the long run. However, in practice, implementing this can still be a bit tricky, and most frameworks don't make this easy.
Then sends, the server-side rendered, content directly. To the client. You can include other web services here as well that can't, deal with rendering for, example maybe social media services, or chat services, anything, that tries to extract structured. Information from your pages and for. All other requesters, so your normal users you, would serve your normal hybrid, or client-side, rendered code this. Also gives, you the best of both worlds and makes. It easy for you to migrate to hybrid, rendering for your users over time as well. One. Thing to note this is not a requirement, for JavaScript, sites to be indexed, as you'll see later Googlebot. Can render most, pages already. For. A dynamic rendering our recommendation, is to add a new tool or step in your server infrastructure, to act as a dynamic renderer, this. Reads, your normal. Client-side. Content, and sends, the pre-rendered, version to search engine crawlers so, how. Might you implement, that. We. Have two options here that help. You to kind of get started the first is puppeteer, which is an ojs library, which wraps a headless version of google chrome underneath, this, allows you to render pages on your own another. Option, is render try which, is which. You can run as a software-as-a-service, that.

Renders, And caches your content, on your site as well both, of these are open source so, you could make your own version or, use something from a third party that does something similar as well for. More information, on these I'd recommend, checking out the i/o session, on headless. Crown I believe there's a recording about that already, either. Way keep, in mind rendering, can be pretty resource intensive, so, we, recommend doing this out-of-band, from your normal web server and implementing. Caching, as as. You need it so. Let's, take a quick look at what your server infrastructure, might look like with a dynamic renderer. Integrated. Request, from Googlebot come, in on the side here, they're. Sent to your normal server and, then perhaps through a reverse proxy they're, sent to the dynamic, render there, it requests, a render is a complete. Final page. And sends, that back to, the. Search engines so without. Needing, to implement or maintain any new code this setup could enable a website, that's, designed only, for client-side, rendering to. Perform dynamic rendering, of the content to Googlebot and to other appropriate. Clients, if. You think about it this kind of solves, the problems that Tom mentioned before. And now, it can be kind of confident, that the important content of our web pages is available, to Googlebot when it performs its initial wave of indexing, so. How, might you recognize, Googlebot, requests, this. Is actually pretty easy so, the easiest way to do that is to find, Googlebot, in the user agent string you. Can do something similar for other services, that. You want to serve pre-rendered, content to and for, Googlebot as well as some others you can also do a reverse DNS lookup if you want to be sure that, you're serving it just to legitimate, clients. One, thing to kind of watch out for here is that if you serve adapted. 
Content, to smartphone, users versus, desktop users, or you redirect users, to different urls depending, on the device that they use you. Must make sure that dynamic, rendering also, returns device focused, content in, other words mobile. Search engine crawlers when, they go to your web pages they should see the mobile version of the page and the, others should, see the desktop version if. You're using responsive, design so, if you're using the same HTML and. Just using CSS, to conditionally, change the, way the content is shown to users this, is one thing you don't need to watch out for because the HTML is exactly. The same. What's. Not immediately. Clear from, the user agents, is that, Googlebot is currently, using a somewhat older browser to, render pages it. Uses chrome 41 which was released in 2015. The. Most visible, implication. For developers, is that newer JavaScript, versions, and coding conventions like, arrow functions, aren't, supported, by Googlebot, and whatsapp. Also any API that was added after chrome, 41 currently, isn't supported, you, can check these on a site like can I use. And while, you could theoretically, install. An older version of chromium we don't recommend doing that for obvious security reasons. Additionally. There's, some api's that Googlebot doesn't support because they don't provide additional value for, search will check these out -. All. Right so. You might be thinking this sounds like a lot of work I. Don't know do I really need to do this so. A, lot. Of times Googlebot, can render pages properly, what why do I really have to watch out for this well there, are a few reasons to watch out for this first. Is if your site is large and rapidly changing for, example if you have a news website that. Has a lot of new content that keeps coming out regularly and requires quick indexing, as Tom, showed. Rendering. Is deferred from, indexing, so, if, you have a large, dynamic, website, then the. 
New content, might take a while to be indexed otherwise. Secondly. If you, rely on modern, JavaScript functionality, for, example if you have any libraries, that can't be transpiled, back to es5 then. Dynamic. Rendering can't help you there and that. Said we continue. To recommend using, proper, graceful degradation techniques. So that even older clients have access to your content, and. Finally, there's a third. Reason, to also look into this in, particular if you're using social. Media sites, if your site relies, on sharing. Through social, media or, through chat, applications. If. These services require, access to your pages content then dynamic, rendering can help you there too. So. When. You might you not use dynamic, rendering I think. The main aspect, here is balancing, the time and effort needed to implement and to run this, with the gains that are received.
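As a sketch of the user-agent detection described above (the bot list and function names are my own, and a production setup should also verify crawlers, for example with the reverse DNS lookup John mentions):

```javascript
// Minimal sketch: decide whether a request should receive the
// pre-rendered (dynamic rendering) version of a page.
// This list of crawlers is illustrative, not exhaustive.
const BOT_USER_AGENTS = /googlebot|bingbot|twitterbot|linkedinbot|slackbot/i;

function wantsPrerendered(userAgent) {
  return BOT_USER_AGENTS.test(userAgent || '');
}

// In a request handler you might branch on this flag, e.g.:
//   if (wantsPrerendered(req.headers['user-agent'])) {
//     proxyToRenderer(req, res);  // hypothetical helper: Rendertron/Puppeteer
//   } else {
//     serveClientApp(req, res);   // hypothetical helper: normal SPA shell
//   }
```

Note this only routes the request; the actual rendering would still happen out-of-band, in something like Puppeteer or Rendertron, ideally with caching in front.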

So remember, implementation and maintenance of dynamic rendering can use a significant amount of server resources. And if you see that Googlebot is able to index your pages properly, and you're not making critical, high-frequency changes to your site, maybe you don't need to implement anything special. Most sites should be able to let Googlebot render their pages just fine. Like I mentioned, if Googlebot can render your pages, you probably don't need dynamic rendering for that site. Let's take a look at a few tools to help you figure out what the situation is.

When diagnosing rendering, we recommend doing so incrementally: first checking the raw HTTP response, and then checking the rendered version, either on mobile, or on mobile and desktop if you serve different content. Let's take a quick look at these.

So, looking at the raw HTTP response: one way to do that is to use Google Search Console. To gain access to Google Search Console and a few other features that they have there, you first need to verify ownership of your website. This is really easy to do, and there are a few ways to do it, so I'd recommend doing that regardless of what you're working on. Once you have your site verified, you can use a tool called Fetch as Google, which will show the HTTP response that was received by Googlebot, including the response code on top and the HTML that was provided before any rendering was done. This is a great way to double-check what is happening on your server, especially if you're using dynamic rendering to serve different content to Googlebot.

Once you've checked the raw response, I recommend checking how the page is actually rendered. The tool I use for this is the mobile-friendly test; it's a really fast way of checking Google's rendering of a page. As the name suggests, it's made for mobile devices. And as you might know, over time our indexing will be primarily focused on the mobile version of a page; we call this mobile-first indexing. So it's good to already start focusing on the mobile version when you're testing rendering.

We recommend testing a few pages of each kind of page within your website. So, for example, if you have an e-commerce site, check the home page, some of the category pages, and some of the detail pages. You don't need to check every page on the whole website, because a lot of times the templates will be pretty similar. If your pages render well here, then chances are pretty high that Googlebot can render your pages for search as well. One thing that's kind of a downside here is that you just see the screenshot; you don't see the rendered HTML. So what's one way to check the HTML? Well, new for I/O, and I think we launched this yesterday, we've added a way to review the HTML after rendering. This is also in the mobile-friendly test: it shows you what was created after rendering with the mobile Googlebot, and it includes all of the markup for links, for images, for structured data, any invisible elements that might be on the page after rendering.

So what do you do if the page just doesn't render properly at all? We also just launched a way to get full information about loading issues for a page as well. This is also part of the mobile-friendly test: you can see all of the resources that were blocked by Googlebot, so this could be JavaScript files or API responses. A lot of times, not everything needs to be crawled, kind of like Tom mentioned. For example, if you have tracking pixels on a page, Googlebot doesn't really need to render those tracking pixels. But if you use an API to pull in content from somewhere else, and that API endpoint is blocked by robots.txt, then obviously we can't pull in that content at all. An aggregated list of all of these issues is also available in Search Console.

So, when pages fail in a browser, usually I check the developer console for more information, to see more details on exceptions.
New for I/O, and one of the most requested features from people who make JavaScript-powered sites for search: we're also showing the console log when Googlebot tries to render something. This allows you to check for all kinds of JavaScript issues, for example if you're using ES6, or if you just have other issues with the JavaScript when it tries to run. This makes my life so much easier, because I don't have to help people with all of these detailed rendering issues as much.

Desktop is also a topic that still comes up. As you've seen in maybe some of the other sessions, desktop isn't quite dead, so you can run all of these diagnostics in the rich results test as well. This tool shows the desktop version of these pages.

So now that we've seen how to diagnose issues, what kind of issues have we run across with modern JavaScript-powered sites? What patterns do you need to watch out for and handle on your side?

Well, remember Tom mentioned at the beginning of the talk something about lazy-loading images and being unsure if they're indexable? It turns out they're only sometimes indexable, so it's good to look at that. Depending on how lazy loading is implemented, Googlebot may be able to trigger it, and with that may be able to pick up these images for indexing. For example, if the images are above the fold and your lazy loading runs for those images automatically, then Googlebot will probably see them. However, if you want to be sure that Googlebot is able to pick up lazy-loaded images, one way to do that is to use a noscript tag: you can add a noscript tag around a normal image element, and we'll be able to pick that up for image search directly.
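The noscript fallback John describes might look like this. The class name and data attribute here are illustrative assumptions; the exact markup depends on which lazy-loading script you use:

```html
<!-- The lazy-loading script swaps data-src into src when the image scrolls
     into view; the noscript fallback gives Googlebot a plain <img> to index. -->
<img class="lazy" data-src="/images/product.jpg" alt="Product photo">
<noscript>
  <img src="/images/product.jpg" alt="Product photo">
</noscript>
```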

Another approach is to use structured data on a page. When we see structured data that refers to an image, we can also pick that up for image search. As a side note on images: we don't index images that are referenced only through CSS; we currently only index images that are embedded with structured data markup or with image tags.

Apart from lazy-loaded images, there are other types of content that require some kind of interaction to be loaded. What about tabs that load their content after you click on them, or infinite scroll patterns on a site? Googlebot generally won't interact with a page, so it wouldn't be able to see these. There are two ways that you can get this content to Googlebot, though. Either you can preload the content and just use CSS to toggle its visibility on and off, so that Googlebot can see that content in the preloaded version; or, alternatively, you can just use separate URLs and navigate the user and Googlebot to those pages individually.

Now, Googlebot is a patient bot, but there are a lot of pages that we have to crawl, so we have to be efficient and go through pages fairly quickly. When pages are slow to load or render, Googlebot might miss some of the rendered content. And since embedded resources are aggressively cached for search rendering, timeouts are really hard to test for. So, to limit these problems, we recommend making performant and efficient web pages, which you're hopefully already doing for your users anyway, right? In particular, limit the number of embedded resources, and avoid artificial delays like timed interstitials. You can test pages with the usual set of tools, and roughly test rendering with the mobile-friendly testing tool. And while timeouts there are a little bit different from indexing in general, if the pages work in the mobile-friendly test, they'll work for search indexing too. Additionally, Googlebot
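The preload-and-toggle approach for tabbed content could be sketched like this. The class names are illustrative; the point is that every tab panel is present in the initial HTML, and CSS merely hides the inactive ones:

```html
<!-- All tab content is in the initial HTML, so Googlebot sees it without
     clicking; JavaScript only moves the "active" class between panels. -->
<div class="tab-panel active" id="description">Full product description…</div>
<div class="tab-panel" id="reviews">All the product reviews…</div>

<style>
  .tab-panel { display: none; }
  .tab-panel.active { display: block; }
</style>
```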
wants to see the page as a new user would see it, so we crawl and render pages in a stateless way. Any API that tries to store something locally would not be supported, so if you use any of these technologies, make sure to use graceful degradation techniques to allow anyone to view your pages, even if these APIs are not supported.

And that was it with regards to critical best practices. Now it's time to quickly circle back and recap what we've seen.
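The graceful degradation John describes might look like this in practice. This sketch is not from the talk: `safeStorage` is a hypothetical helper that falls back to an in-memory store when a storage API is unavailable, so a stateless client such as Googlebot still gets a working page:

```javascript
// Returns the given Web Storage object if it works, otherwise an in-memory
// substitute with the same basic interface. A stateless client (for example,
// Googlebot rendering the page) gets a functioning page either way.
function safeStorage(storage) {
  try {
    const probe = '__storage_probe__';
    storage.setItem(probe, '1');
    storage.removeItem(probe);
    return storage;
  } catch (e) {
    const mem = new Map();
    return {
      setItem: (key, value) => { mem.set(key, String(value)); },
      getItem: (key) => (mem.has(key) ? mem.get(key) : null),
      removeItem: (key) => { mem.delete(key); },
    };
  }
}

// Usage in a browser: const store = safeStorage(window.localStorage);
// store.setItem('theme', 'dark');
```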

So first, we recommend checking for proper implementation of the best practices that we talked about; in particular, lazy-loaded images are a really common issue. Second, test a sample of your pages with the mobile-friendly test, and use the other testing tools as well. Remember, you don't need to test all of your pages; just make sure that you have all of the templates covered. And then finally, if pages are large, if sites are large and quick-changing, or if you can't reasonably fix rendering across the site, then maybe consider using dynamic rendering techniques to serve Googlebot and other crawlers a pre-rendered version of your page. And if you do decide to use dynamic rendering, make sure to double-check the results there as well.

One thing to keep in mind: indexing isn't the same as ranking, but generally speaking, pages do need to be indexed before their content can appear in search at all. I don't know, Tom, do you think that covers about everything?

Well, it was a lot to take in, John, some amazing content. But I guess one question I have, and I think maybe other people in the audience have this on their minds as well: is it always going to be this way, John?

That's a great question, Tom. I don't know, but I think things will never stay the same. As you mentioned in the beginning, this is a challenge for us that's important within Google Search. We want our search results to reflect the web as it is, regardless of the type of website that's used. So our long-term vision is that you, the developers, shouldn't need to worry as much about this for search crawlers. Circling back on the diagram that Tom showed at the beginning with deferred rendering: one change we want to make is to move rendering closer to crawling and indexing. Another change we want to make is to have Googlebot use a more modern version of Chrome over time. Both of these will take a bit of time. I don't like making long-term predictions, but I suspect it will be at least until the end of the year before this works a little bit better. And similarly, we trust that rendering will become more and more common across all kinds of web services, so at that point dynamic rendering will probably be less critical for modern sites. However, the best practices that we talked about will continue to be important there as well. How does that sound, Tom?

It sounds really great. I think that covers everything, and I hope everyone in the room has learned some new approaches and tools that are useful for making your modern JavaScript-powered websites work well in Google Search. If you have any questions, we'll be in the mobile web sandbox area together with the Search Console team. Alternatively, you can always reach out to us online as well, be it through Twitter, our live office hours hangouts, or the webmaster help forum. So thanks, everyone, for your time. Thank you.
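As a footnote to the dynamic rendering discussion above: the crawler detection it relies on is typically a server-side user-agent check. A minimal sketch follows; the regular expression and the handler names are illustrative assumptions, not an official list, so consult Google's documentation for the actual crawlers to cover:

```javascript
// Crawlers whose requests should receive the pre-rendered HTML.
// This list is illustrative only, not an official registry of user agents.
const CRAWLER_UA = /googlebot|bingbot|baiduspider|twitterbot|linkedinbot/i;

function isCrawler(userAgent) {
  return CRAWLER_UA.test(userAgent || '');
}

// In an Express-style server (hypothetical handlers):
// app.get('*', (req, res) => {
//   if (isCrawler(req.headers['user-agent'])) {
//     servePrerenderedHtml(req, res);  // e.g. output of a headless renderer
//   } else {
//     serveClientSideApp(req, res);    // normal single-page app shell
//   }
// });
```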

2018-05-12 15:24



Very useful information, loving the transparency.

How can we make sure that Google will not consider dynamic rendering as cloaking? Previously there was a recommendation not to check for Googlebot.

Right, but I did not find these details either.

He did mention that this was the change in policy they made. But I don't see any documentation on it.

I have a question: we're working on a brand new site, which is built on JS. We've blocked it with robots.txt, as we're afraid the bot might index a lot of "empty" pages that are without dynamic rendering... However, I want to test and see how Googlebot will render those pages. But I can't test that until I unblock them in robots.txt, right? I mean, I can't even use Search Console's "Fetch as Google" while a page is blocked by robots.txt. So, what might be the solution to check how Googlebot will render my site without opening up the robots.txt file?

You can open up robots.txt after adding a meta noindex to your pages, and then use Google Search Console to see how bots would see your pages.

23:55 provides a solution of implementing server-side rendering only for Googlebot. That might be a good solution; however, I thought that was considered search engine cloaking (providing different results to users/bots), which would penalize your SEO... isn't it?

Seems like a rehash (pun intended) of the ?_escaped_fragment_= protocol that has been deprecated. So if you’re like me and already implemented that.. all you have to do is serve that content using bot detection instead of query detection.

It's also a ridiculous, unrealistic "solution" equivalent to "build us a special, second website that we can search, please". Ugh.

spa = bad idea

Superb work, Thanks a lot. Keep it up. :)

Thanks, Google!

Awesome info.. Thank you!

Great and helpful information! Thanx Google!

Google, Please provide a link to the documentation regarding dynamic rendering and the official policy change.


You're missing a 'b' in a part of the info :) "Watch more >Wemasters< sessions from I/O '18 here" Just trying to help, keep being awesome and an inspiration! :) Awesome video

Interesting point about the complete dynamic rendering for search bots user-agent and not users!

Questions: You mention using the mobile friendly tool and the rich results testing tool as rendering test platforms, essentially. Why do this instead of using Fetch and Render in Search Console? In fact the first time I tried to use the rich results tool it told me that the page was not eligible for "rich results known by this test."

The dynamic rendering is so ridiculous... What makes you think that I'm going to code like a #$%#! just to make your job simpler, when implementing that requires significant infrastructure? Google does incredible things many times, but this... this goes nowhere. I really don't think people are going to implement this, or if they try, they are going to abandon it after trying...

Will GoogleBot use Chrome 59 in 2018 ? Because, you know, ES20**.

On my website I use the #! fragment and it is perfect for the users: I show the content without refreshing the whole page. But now Google does not recommend this, and my site has fallen in terms of indexed pages, and therefore its positioning too. I do not understand why they do not take the content that comes after #!. Google always recommends focusing on users when the site is built, but this is no longer the case. In my case the site works perfectly for users, they see the content, but now for Google this is insignificant, and if I now have to change something in my code, it is for Google to interpret it. Contradictory, no? Anyway, in search engines like Bing or DuckDuckGo this does not happen; they do crawl all the content of my site. They say to use the History API, which I was trying to do, and I cannot make it work for my case. So, do we focus on users or search engines?

Hello, I also work that way on my website. How did you solve the #! fragment issue?

This technical aspect is really important for following up on website building. Truly, thanks.

My thoughts exactly.

Use React Static. Problem solved :)

@10:42 There is a coding mistake! I think you mean to write the following example: Will be Crawled The Javascript Part would not be crawled!!! Even John Mueller has said Google doesn't crawl AJAX anymore. #GoogleJobs


I'm waiting for Dynamic Rendering to show up in Google's documentation. I hope they provide specific rules to do detection of UserAgents. Until then... I'm not going to risk a cloaking penalty! To answer your question... I use history pushstate with a if a snapshot exists. So I never actually use #! in my urls.

You must make a video tutorial in the Indonesian language too.
