Mar. 28th, 2008 01:10 pm
armtuk: Cheetah (Default)
Training yesterday at MarkLogic was good. We got an intensive course in XPath, which was useful. I've used XPath a fair bit before while working with XSLT, but it was good to get a more in-depth tutorial on it from an expert source.

We also talked about search functionality. The search functionality is very advanced, but it seems to have been built more in the traditional model than in the Google model. There are functions to OR things together and AND things together, and to do case-sensitive, case-insensitive, or diacritic-insensitive search and so on, but there isn't a single global function that you can just pass a query string to and have it work the way the Google search box does. This makes it seem like implementing a really cool search app that can do everything from the main search box will be a bit tricky, as you are going to have to break down the query that the user typed into its component parts before figuring out which functions you need to execute. I know that XQuery is quite powerful based on what we've been seeing over the last couple of days, but this will stretch it to the limit. I think I would opt to implement this sort of thing in Java somehow. It's going to be interesting to see what the Mark Logic guys do for the PoC.
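To make the "break the query down into its component parts" step concrete, here is a minimal sketch in Python (not XQuery or Java, and not anything Mark Logic provides). The `ParsedQuery` class, the `parse_query` function, and the query syntax itself (quoted phrases, an uppercase `OR` between alternatives, a leading `-` for exclusion) are all assumptions made up for illustration. The output would still have to be mapped onto whatever AND/OR search functions the server actually offers.

```python
import re
from dataclasses import dataclass, field


@dataclass
class ParsedQuery:
    """Hypothetical container for the pieces of a search-box query."""
    required: list = field(default_factory=list)  # terms/phrases to AND together
    any_of: list = field(default_factory=list)    # groups of alternatives joined by OR
    excluded: list = field(default_factory=list)  # terms prefixed with '-'


# A token is a quoted phrase (optionally negated) or a run of non-space characters.
TOKEN = re.compile(r'-?"[^"]*"|\S+')


def parse_query(text):
    """Split a Google-style query string into required, OR-ed, and excluded parts."""
    tokens = TOKEN.findall(text)
    parsed = ParsedQuery()
    i = 0
    while i < len(tokens):
        tok = tokens[i]
        if tok.startswith("-") and len(tok) > 1:
            # Leading '-' marks a term or phrase to exclude.
            parsed.excluded.append(tok[1:].strip('"'))
            i += 1
            continue
        group = [tok.strip('"')]
        i += 1
        # Fold "a OR b OR c" into a single group of alternatives.
        while i + 1 < len(tokens) and tokens[i] == "OR":
            group.append(tokens[i + 1].strip('"'))
            i += 2
        if len(group) > 1:
            parsed.any_of.append(group)
        else:
            parsed.required.append(group[0])
    return parsed
```

For example, `parse_query('xquery "stemmed search" java OR python -solr')` yields `required` of `["xquery", "stemmed search"]`, `any_of` of `[["java", "python"]]`, and `excluded` of `["solr"]`. The sketch ignores parentheses, field prefixes, and malformed input such as a trailing `OR`, which is exactly the sort of detail that makes the real thing tricky.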

One of the downsides, it seems, is that you can only use one language for stemming in a given database, and you have to build a stemmed index on your content, which is a range index, which means that it must fit in RAM. I'm not sure how we are going to handle content from multiple languages in this context. I can't imagine how the Solr folks are even going to begin to address this sort of stuff that's more or less built in to Mark Logic. I can see this being an interesting competition. As for the stemming index living in RAM, I hope I'm wrong on that point too.


Mar. 27th, 2008 09:35 am
armtuk: Cheetah (Default)
Yesterday was the first day of training, and it was really good. The trainer is very knowledgeable and is a good teacher. She is explaining concepts well and giving us a very detailed look into the system. I think that I will be able to answer most if not all of the questions that I have been sent here to answer.

It is clear that the folks at Mark Logic have thought a great deal about their product, and it has a very rich feature set. Some of our concerns about collation and diacritics have been allayed, as it appears we can set the level of sensitivity to such things in the configuration of the server. Mark Logic has built a huge set of extensions to the standard XQuery function library, allowing you to write entire applications just in XQuery, which is a great feature. There are also a significant number of APIs to the server, such as XDBC for Java and .NET, WebDAV, and standard HTTP.

I got a crash course in XQuery yesterday and found out some very interesting things about certain operators. When the operands are sequences, the = operator by default returns true if any member of operand a matches any member of operand b, so (1,2,3) = (3,4) returns true, which is interesting. If you want true equivalence, you must use the function deep-equal. And if you want node identity, there is a separate operator (is) for that.

XQuery is a purely functional language, so pretty much everything has to be a function, with the exception of FLWOR expressions. My initial assessment that XQuery was really just XPath plus FLWOR was a bit short of the mark, however, as it appears you can define your own functions in XQuery, allowing you to build complex systems. So it's basically XPath + XQuery functions + FLWOR + custom functions. I wasn't far off, but that extra bit accounts for a significant chunk of functionality. It's going to be interesting to see how the Mark Logic folks on our project do their PoC.
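The "any member matches any member" behaviour of = is easier to see when modelled outside XQuery. The following is a small Python sketch of the two semantics, not XQuery itself: `general_equals` and `deep_equal` are hypothetical helper names, sequences are modelled as tuples of plain values, and the type promotion and node handling that the real operators perform are left out.

```python
def general_equals(left, right):
    """Model of the XQuery `=` comparison on sequences.

    True if ANY item on the left equals ANY item on the right.
    """
    return any(a == b for a in left for b in right)


def deep_equal(left, right):
    """Simplified model of deep-equal for sequences of atomic values.

    True only if both sequences have the same length and the items
    match pairwise, in order.
    """
    return len(left) == len(right) and all(a == b for a, b in zip(left, right))


# The example from the training: one shared item (3) is enough for `=`.
assert general_equals((1, 2, 3), (3, 4)) is True
assert deep_equal((1, 2, 3), (3, 4)) is False

# Identical sequences satisfy both.
assert deep_equal((1, 2, 3), (1, 2, 3)) is True

# With nothing to pair up, the "any pair" test cannot succeed.
assert general_equals((), ()) is False
```

One consequence worth remembering is that an empty sequence never compares equal to anything under the "any pair" rule, not even to another empty sequence, whereas two empty sequences are deep-equal.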

