CommunityNews

Finding Dead Websites

As some of the work planned for Marginalia Search this year has been progressing a bit faster than anticipated, there was time to implement an unplanned change.
This post details the implementation of a system for detecting when servers are online, both to avoid serving dead links and to improve data quality, and for detecting when websites change significantly, including ownership transfers and domain parking.
Table Of Contents: Feature Rationale, Data Representation, Live Data, Event Data, Change Detection Details, Availability Detection, Ownership Changes, DNS, Implementation Hurdles, Scheduling, Certificate Validation, Conclusions

Feature Rationale

Availability detection is useful not just for filtering out dead links in the search results, but for informing the crawler that it should stop trying to reach a dead domain, as well as a host of other things.
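
To make the idea concrete, here is a minimal Python sketch of availability tracking. It is not the Marginalia Search implementation, which this excerpt does not show. The class name `AvailabilityTracker`, the `tcp_probe` helper, and the threshold of three consecutive failures are all assumptions made for illustration.

```python
import socket
from typing import Callable, Dict, Iterable, List

# Hypothetical threshold: failed probes in a row before a domain counts as dead.
DEAD_AFTER_FAILURES = 3


def tcp_probe(domain: str, port: int = 443, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to the host can be opened."""
    try:
        with socket.create_connection((domain, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections, and timeouts.
        return False


class AvailabilityTracker:
    """Tracks consecutive probe failures per domain (illustrative only)."""

    def __init__(
        self,
        probe: Callable[[str], bool] = tcp_probe,
        dead_after: int = DEAD_AFTER_FAILURES,
    ) -> None:
        self.probe = probe
        self.dead_after = dead_after
        self.failures: Dict[str, int] = {}

    def check(self, domain: str) -> bool:
        """Probe once and update the failure streak; True if still considered alive."""
        if self.probe(domain):
            self.failures[domain] = 0
        else:
            self.failures[domain] = self.failures.get(domain, 0) + 1
        return not self.is_dead(domain)

    def is_dead(self, domain: str) -> bool:
        return self.failures.get(domain, 0) >= self.dead_after

    def live_only(self, domains: Iterable[str]) -> List[str]:
        """Filter out domains currently considered dead, e.g. before serving links."""
        return [d for d in domains if not self.is_dead(d)]
```

Requiring several consecutive failures before declaring a domain dead is one way to avoid dropping a site over a single transient outage. The same `is_dead` answer could be consulted both when filtering search results and when deciding whether the crawler should keep retrying a domain.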
