Saturday, August 11, 2018

Updating Mysteries of the Internet

Earlier this year I posted an article on some mysterious happenings on the internet, where a self-Google turned up random compilations of internet articles by and about me, interspersed with gibberish and references to other people named David Chan and Chinese food. Beyond speculating that this was some sort of scam to get people to visit the offending page, I really had no idea what was happening.

More recently I encountered a more polished exemplar, actually entitled the "David Chan Chinese Food Blog," without the gibberish and including many illustrations from articles I had written, as well as articles written about me. It was hosted on a website called Omniboo, which apparently created hundreds, if not thousands, of similar websites on other food topics. This probably indicates that these pages are automated creations using more sophisticated search engine technology than in past years. Then I saw a similar page for the historic Paul's Kitchen restaurant in the City Market district of Los Angeles, with some really nice photos I hadn't seen. Interestingly, the page seemed to be aimed at people who wanted to remodel their kitchen.

The fake Paul's Kitchen page was so interesting that I posted it on the Food Talk Central board, accompanied by a request for a possible explanation of how and why the page was created. The board's founder responded with a brief but telling answer--black hat search engine optimization. While this doesn't explain everything I've wondered about in the past, it goes a long way toward explaining the phenomenon I had witnessed. Essentially, there's nothing more valuable than getting people to come to your web page, both from a statistical point of view (the more hits the better) and for less honorable reasons (to attract people to a dangerous page, or at a minimum, a page they never otherwise would have searched for). Black hat SEO describes nefarious strategies to accomplish the goal of getting web page visits.

In the more primitive days of the internet, common tools to drive traffic included invisible text and stuffing. Invisible text was a device where a search engine would register a "hit" for searched language, but that language itself was not visible to the reader of a web page. This sort of explained the first internet mystery I encountered, when I saw that people had reached my blog via links from totally unrelated websites, but when I went to one of those websites there was no reference to my blog page. Now I say this sort of explains things, because it would make sense if I had done that to drive traffic to my blog, but this was actually done on the third-party (often Russian) website. Still, I presume the concept of invisible text somehow explains what happened.
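
For the technically curious, here is a minimal Python sketch of how invisible text could be spotted in a page's HTML. It is only an illustration under stated assumptions: it looks solely at inline `style` attributes, the list of hiding tricks is short and incomplete, and the names `HiddenTextFinder` and `find_hidden_text` are made up for this example. It says nothing about how any particular site actually hid its text.

```python
from html.parser import HTMLParser

# Inline-style fragments that commonly hide text (an illustrative, incomplete list).
HIDING_HINTS = ("display:none", "visibility:hidden", "font-size:0")

# Elements that never get a closing tag, so they must not affect nesting.
VOID_TAGS = {"br", "img", "hr", "input", "meta", "link", "area",
             "base", "col", "embed", "source", "track", "wbr"}


class HiddenTextFinder(HTMLParser):
    """Collects text that sits inside an element styled to be invisible."""

    def __init__(self):
        super().__init__()
        self.stack = []        # one boolean per open element: is it hidden?
        self.hidden_text = []  # text fragments a reader would not see

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        # Normalize the style so "display: none" and "display:none" both match.
        style = (dict(attrs).get("style") or "").replace(" ", "").lower()
        hides = any(hint in style for hint in HIDING_HINTS)
        inherited = bool(self.stack) and self.stack[-1]
        self.stack.append(hides or inherited)

    def handle_endtag(self, tag):
        if tag not in VOID_TAGS and self.stack:
            self.stack.pop()

    def handle_data(self, data):
        if self.stack and self.stack[-1] and data.strip():
            self.hidden_text.append(data.strip())


def find_hidden_text(html):
    """Return the text fragments found inside hidden elements."""
    finder = HiddenTextFinder()
    finder.feed(html)
    finder.close()
    return finder.hidden_text
```

Feeding it a page where a block of search terms is wrapped in an element styled `display: none` returns those terms, while ordinary visible paragraphs are ignored. Text hidden by other means, such as white text on a white background set in a stylesheet, would slip past this sketch.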

Subsequently I ran into websites which included my name and some of my works, along with gibberish mentioning many other topics. This is probably an example of stuffing, where a wide variety of search terms are stuffed onto the webpage to gain a maximum number of hits. Stuffing makes sense to me, particularly with my discovery that the stuffed website link would take me to different webpages depending on which device I was using to access the internet. Where the Google search link would take me to a stuffed webpage mentioning my name on one device, the same link on another device would go to an advertising website, and the same link on yet another device would land on a porn site. So that perfectly explains the gibberish websites.
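
How the same link can land on different pages is easy to sketch, although what follows is a guess at the mechanism rather than anything confirmed about these particular sites. One well-known approach is for the server to read the User-Agent string that every browser and search crawler sends, and to choose a destination based on it. In the Python sketch below, the URLs, the categories, and the function names `classify_visitor` and `pick_destination` are all hypothetical.

```python
# Hypothetical destinations; none of these URLs come from the sites described above.
DESTINATIONS = {
    "crawler": "https://example.com/stuffed-page",  # what a search engine gets to index
    "mobile": "https://example.com/ad-landing",     # where a phone visitor is sent
    "desktop": "https://example.com/other-landing", # where everyone else is sent
}


def classify_visitor(user_agent):
    """Roughly sort a visitor into a category using the User-Agent string."""
    ua = user_agent.lower()
    # Check for crawlers first, since a crawler may also mention a phone platform.
    if "googlebot" in ua or "bingbot" in ua:
        return "crawler"
    if "iphone" in ua or "android" in ua:
        return "mobile"
    return "desktop"


def pick_destination(user_agent):
    """Return the page this kind of visitor would be sent to."""
    return DESTINATIONS[classify_visitor(user_agent)]
```

Under this assumption, the search engine only ever sees the stuffed page full of search terms, while a person clicking the very same link from a phone or a laptop is sent somewhere else entirely.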

Now the next generation of these fake websites is more of a puzzle.  The fake Paul's Kitchen page has a number of links on the side to subpages on the same website apparently selling various kinds of kitchen sinks (but I'm certainly not going to click on those links).   Meanwhile the fake David Chan Chinese Food Blog doesn't have any other external links, though there are links to other Omniboo food sites.  On the surface, I guess these fake pages are used to publicize the host page, but given all that is going on on the internet, who knows what the real answer truly is.
