<?xml version="1.0"  encoding="utf-8"?>
	<feed xmlns="http://www.w3.org/2005/Atom">
		<title>Comments to Critique Part Four:  Site Search</title>
		<link href="http://www.practicalecommerce.com/atom/article/494/" rel="self"/>
  	<updated>2007-06-08T06:28:02-07:00</updated>
		<author>
  	  <name>Practical Ecommerce</name>
			<email>info@practicalecommerce.com</email>
  	</author>
  	<id>http://www.practicalecommerce.com/</id>
		<rights>Copyright 2007 Confluence Publishing DBA Practical Ecommerce</rights>
		<entry>
			<title>Falafulu Fisi</title>
			<link href="http://www.practicalecommerce.com/articles/494/Critique-Part-Four--Site-Search/#comment3018" rel="alternate"/>
			<id>http://www.practicalecommerce.com/articles/494/Critique-Part-Four--Site-Search/#comment3018</id>
			<updated>2007-06-08T06:28:02-07:00</updated>
			<summary>I would also recommend that anyone seeking to buy an off-the-shelf content search engine product (site search engine) for deployment on an e-commerce website ask the vendors for the &quot;Precision&quot; &amp; &quot;Recall&quot; figures of their search engine. They are given as percentages. The higher these numbers, the better the retrieval capability of the search engine.

&quot;Precision&quot; is defined in information retrieval as: &quot;The proportion of retrieved documents that are relevant, out of all the documents retrieved&quot;.

&quot;Recall&quot; is defined in information retrieval as: &quot;The proportion of relevant documents that are retrieved, out of all relevant documents available&quot;.
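
As a rough illustration of the two definitions above, here is a minimal Python sketch. The function name precision_recall and the product ids are hypothetical, made up for this example, and a vendor may well measure these figures over many queries rather than one.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) as fractions between 0 and 1."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = retrieved.intersection(relevant)  # documents both retrieved and relevant
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical query: the engine returns 4 products, 3 of which are relevant,
# while 6 relevant products exist in the catalogue.
retrieved = ["p1", "p2", "p3", "p9"]
relevant = ["p1", "p2", "p3", "p4", "p5", "p6"]
precision, recall = precision_recall(retrieved, relevant)
print(f"precision = {precision:.0%}, recall = {recall:.0%}")
```

In this made-up case precision is 75% (3 of the 4 results are relevant) and recall is 50% (3 of the 6 relevant products were found).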

More on how the retrieval effectiveness of a search engine is measured can be found at the link below:

&quot;Information Retrieval&quot;
http://en.wikipedia.org/wiki/Information_retrieval

It is important to get these numbers from the vendor, so that you have some understanding of the ability of the...</summary>
			</entry>
			
				<entry>
			<title>Falafulu Fisi</title>
			<link href="http://www.practicalecommerce.com/articles/494/Critique-Part-Four--Site-Search/#comment3006" rel="alternate"/>
			<id>http://www.practicalecommerce.com/articles/494/Critique-Part-Four--Site-Search/#comment3006</id>
			<updated>2007-06-06T17:01:37-07:00</updated>
			<summary>Google has integrated Latent Semantic Indexing (LSI) technology into its search engine only over perhaps the last 2 or 3 years, whereas LSI has been published and available in the literature since the early 1990s.

The adoption of LSI by Google is mentioned in this article:

http://www.seobook.com/archives/000657.shtml

I don&#039;t know how Google combines PageRank &amp; LSI, but surely the two must be computed separately and their indices somehow combined into one. The input to PageRank is a matrix of links (which documents link to which other documents), while the input to LSI is a term-by-document matrix, so the two are clearly computed separately and then somehow combined.
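
To make the difference between the two inputs concrete, here is a small Python sketch that builds both matrices from a toy corpus. The documents, their links and all variable names are invented for illustration; this only shows the shape of the inputs and says nothing about how Google actually computes or combines them.

```python
# Hypothetical toy corpus: document id mapped to (text, ids of documents it links to).
docs = {
    "d1": ("red running shoes", ["d2"]),
    "d2": ("blue running shorts", ["d1", "d3"]),
    "d3": ("red leather wallet", []),
}
doc_ids = sorted(docs)
terms = sorted({word for text, _ in docs.values() for word in text.split()})

# Link matrix (PageRank-style input): links[i][j] is 1 if document i links to document j.
links = [[1 if target in docs[source][1] else 0 for target in doc_ids]
         for source in doc_ids]

# Term-by-document matrix (LSI-style input): counts[t][d] is how often term t occurs in document d.
counts = [[docs[d][0].split().count(term) for d in doc_ids] for term in terms]

print("links:", links)
for term, row in zip(terms, counts):
    print(term, row)
```

The link matrix is square (documents by documents) while the term-by-document matrix is rectangular (terms by documents), which is one reason the two are naturally processed separately.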

There are newly emerging algorithms, based on multilinear algebra, that solve this problem. I haven&#039;t yet seen in the literature whether any commercial application has been developed; even Google still doesn&#039;t know how, but I am sure that they are working on it.  This...</summary>
			</entry>
			
				<entry>
			<title>Falafulu Fisi</title>
			<link href="http://www.practicalecommerce.com/articles/494/Critique-Part-Four--Site-Search/#comment3000" rel="alternate"/>
			<id>http://www.practicalecommerce.com/articles/494/Critique-Part-Four--Site-Search/#comment3000</id>
			<updated>2007-06-06T04:54:56-07:00</updated>
			<summary>I would highly recommend these algorithms to developers who want to improve their content search engine. Google&#039;s PageRank algorithm is link-based, which makes it different from content-based search approaches such as the following:

 #1) &quot;Using Linear Algebra for Intelligent Information Retrieval&quot; 
  http://lsirwww.epfl.ch/courses/dis/2003ws/papers/ut-cs-94-270.pdf
 
 #2) &quot;Probabilistic Latent Semantic Indexing&quot; 
  http://www.cs.brown.edu/people/th/papers/Hofmann-SIGIR99.pdf
 
 #3) &quot;Algorithms, Initializations, and Convergence for the Nonnegative Matrix Factorization&quot;
  http://meyer.math.ncsu.edu/Meyer/PS_Files/NMFInitAlgConv.pdf

 #4) &quot;Interactive Search Grouping - Search result grouping using Independent Component Analysis&quot;
  http://www2.imm.dtu.dk/pubdb/views/edoc_download.php/825/pdf/imm825.pdf

The search algorithms described in the papers above are today&#039;s state-of-the-art content search algorithms. They differ from traditional keyword search as they...</summary>
			</entry>
			
				
	</feed>