<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Alt om ingenting og litt i mellom &#187; tokenizer</title>
	<atom:link href="http://hovenko.no/blog/tag/tokenizer/feed/" rel="self" type="application/rss+xml" />
	<link>https://hovenko.no/blog</link>
	<description>A blog by Knut-Olav</description>
	<lastBuildDate>Mon, 10 Mar 2025 19:25:02 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.2.1</generator>
		<item>
		<title>Don&#8217;t sort on tokenized strings in Solr</title>
		<link>https://hovenko.no/blog/2009/12/15/dont-sort-on-tokenized-strings-in-solr/</link>
		<comments>https://hovenko.no/blog/2009/12/15/dont-sort-on-tokenized-strings-in-solr/#comments</comments>
		<pubDate>Tue, 15 Dec 2009 07:46:28 +0000</pubDate>
		<dc:creator>Knut-Olav</dc:creator>
				<category><![CDATA[English-posts]]></category>
		<category><![CDATA[Programming]]></category>
		<category><![CDATA[search]]></category>
		<category><![CDATA[Solr]]></category>
		<category><![CDATA[sort]]></category>
		<category><![CDATA[tokenizer]]></category>

		<guid isPermaLink="false">http://hovenko.no/blog/?p=392</guid>
		<description><![CDATA[Apache Solr is a very powerful indexing and search engine. Unfortunately it has some flaws; at least, I doubt this issue is &#8220;by design&#8221;. If you are going to sort on a field, make sure it uses one of Solr&#8217;s native data types, and don&#8217;t enable any tokenizer [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://lucene.apache.org/solr/">Apache Solr</a> is a very powerful indexing and search engine. Unfortunately it has some flaws; at least, I doubt this issue is &#8220;by design&#8221;.</p>
<p>If you are going to sort on a field, make sure it uses one of Solr&#8217;s native data types, and don&#8217;t enable any tokenizer on that data type. If you do, you may end up with an HTTP 500 Internal Server Error and error log messages like this:</p>
<blockquote><p>
SEVERE: java.lang.RuntimeException: there are more terms than documents in field &#8220;title&#8221;, but it&#8217;s impossible to sort on tokenized fields
</p></blockquote>
<p>It turned out that a couple of my fields used a data type with a tokenizer and some filters, which was quite unnecessary since I never search on them; I have a separate field that I search on. I use these fields only for display and sorting.</p>
<p>Keep it simple. Use &#8220;string&#8221; for strings you don&#8217;t have to search on. If you need to both search and sort on the same value, make two fields: a tokenized one for searching and a &#8220;string&#8221; one for sorting. For example, name the latter something like &#8220;title.sort&#8221;.</p>
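<p>As a rough sketch of the two-field setup, assuming a <code>schema.xml</code> in which a tokenized type called &#8220;text&#8221; and the native &#8220;string&#8221; type are both defined (the type name &#8220;text&#8221; and the attribute values below are assumptions, so adjust them to your own schema), it could look something like this:</p>
<pre><code>&lt;!-- Tokenized field: used for searching and display --&gt;
&lt;field name="title" type="text" indexed="true" stored="true"/&gt;

&lt;!-- Untokenized copy: used only for sorting --&gt;
&lt;field name="title.sort" type="string" indexed="true" stored="false"/&gt;

&lt;!-- Fill the sort field from the search field at index time --&gt;
&lt;copyField source="title" dest="title.sort"/&gt;
</code></pre>
<p>You would then search on &#8220;title&#8221; and pass <code>sort=title.sort asc</code> in the query. Keep in mind that a &#8220;string&#8221; field sorts on the exact stored value, so upper- and lowercase titles will not be mixed together.</p>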
]]></content:encoded>
			<wfw:commentRss>https://hovenko.no/blog/2009/12/15/dont-sort-on-tokenized-strings-in-solr/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
	</channel>
</rss>
