Jivespace Community Blog

93 Posts tagged with the clearspace tag

Learn more about using web services to access your data in Clearspace 2.0 with Andrew Wright, Jive Software Engineer.

 

 

You can also download the Quicktime version (Caution: file is ~232MB), or you can watch a larger version online, which will improve readability of embedded screenshots (recommended).

 

The entire presentation is also attached below as a PDF file.

1,333 Views 0 Comments Permalink Tags: clearspace, plugins, web_services, soap, rest, 2.0

During the last couple weeks of the 2.0 development cycle we pushed some really helpful search improvements (some of them bug fixes) into Clearspace. There are a number of posts scattered around our intranet (which is called Brewspace) where the actual improvements / bug fixes are discussed, but I don't believe those improvements made it into the documentation (there are a number of improvements listed in the changelog, but no description of what the improvements are). Hence this blog post.

 

First, in 2.0, the default search operator was changed to 'AND' from 'OR', the end result being that if you did a search like this:

clearspace openid

Clearspace would look for all the blog posts, discussions and documents that contained the term "clearspace" AND contained the term "openid". Way back in Clearspace 1.0 the thought was that we should deviate from what Google does (they AND the terms you input) because we're not searching the entire web; our thinking was that most of our installations would only have a couple thousand documents, blog posts and threads and so we didn't ever want a search for 'clearspace openid ldap' to return nothing if there was a document that discussed two of the three. The reality is that when the search operator was 'OR' the number of results from a search query in Clearspace was almost always greater than 500 (the maximum number of results we would return in a search): in fact, the more words you used in your query, the more likely that you'd end up with a large number of results, which in theory is great (we found a bunch of stuff for you!) but in practice doesn't make for a great user experience (thirty-four pages of search results? come on!). One of the articles (discussed below) had this to say about lots of search results:

Users sometimes measure the success of a query primarily by the number of results it returns. If they feel the number is too large, they add more terms in an effort to bring back a more manageable set.

So not only did "OR" result in more results per query, if you decided that you wanted to refine your query by adding a term, the number of results would actually grow, not shrink, which is the opposite of what you'd expect.
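The effect is easy to reproduce on a toy corpus. The following is a minimal sketch, not Clearspace code: the four documents and the `search` helper are made up for illustration, and matching is plain whole-word lookup rather than anything Lucene does.

```python
# Toy corpus; the document texts are invented for this illustration.
DOCS = {
    1: "clearspace openid login",
    2: "clearspace ldap setup",
    3: "openid ldap bridge",
    4: "clearspace release notes",
}

def search(query, operator="AND"):
    """Return sorted ids of documents matching the query terms under AND or OR."""
    terms = query.lower().split()
    combine = all if operator == "AND" else any
    return sorted(
        doc_id
        for doc_id, text in DOCS.items()
        if combine(term in text.split() for term in terms)
    )

# Adding terms widens the OR result set and narrows the AND result set.
for q in ("clearspace", "clearspace openid", "clearspace openid ldap"):
    print(q, "| OR:", len(search(q, "OR")), "| AND:", len(search(q, "AND")))
```

With 'OR', each extra term can only keep or grow the result count; with 'AND', each extra term can only keep or shrink it, which is what people expect when they refine a query.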

 

The funny thing about changing this default is that when we turned it on for Brewspace (our intranet; no other changes had been made at that point), a number of people noticed right away and were amazed at the 'improvement'. It's really crazy how something as simple as "AND" versus "OR" could make such a big difference in user experience.

 

Before I move on, here are a couple interesting articles I found that talk about search as it relates to user experience:

  • Greg Linden, an ex-Amazon guy, wrote a great blog post a couple months ago that summarized a talk that Marissa Mayer gave about Google and their results page and the number of results per page and also added some notes about his own experience while working at Amazon. The bottom line from the presentation? Speed kills. The faster that we can return search results, the happier Clearspace users will be (although the comments on that post tell a potentially different story, don't miss 'em).

  • A BBC News article from 2006 had this to say about search results:

At most, people will go through three pages of results before giving up, found the survey by Jupiter Research and marketing firm iProspect. It also found that a third of users linked companies in the first page of results with top brands. It also found 62% of those surveyed clicked on a result on the first page, up from 48% in 2002. Some 90% of consumers clicked on a link in these pages, up from 81% in 2002. And 41% of consumers changed engines or their search term if they did not find what they were searching for on the first page.

Takeaway? Relevant results are more important than many results.

The essential problem of search — too many irrelevant results — has not gone away.

More and more, our ongoing research is telling us that Search has to be perfect. Users expect it to "just work" the first time, every time.

 

One thing that was easy to add which has also come up a couple times was search by author. I'm happy to report that in 2.1 we added the ability to search for content authored by a specific user. So just like you can click 'more options' on the search results page today and choose what types of content you want to search for, you'll be able to select a user whose content you want to find and filter the results using that selection.  Side note: that functionality has always been in our API and if you're a URL hacker like I am, you can actually perform Clearspace searches using a pretty URL like this:

http://example.com/clearspace/search/openID

or if you want to search for any content written by the user 'aaron' that contains the word 'openID', you'd use this URL:

http://example.com/clearspace/search/~aaron/openID
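If you want to generate these URLs from a script, a small helper along these lines would do it. This is a sketch under assumptions: `search_url` is a hypothetical name, and percent-encoding the query and author is my own precaution; the post does not say how Clearspace treats spaces or special characters in these paths.

```python
from urllib.parse import quote

def search_url(base, query, author=None):
    """Build base/search/[~author/]query, following the two URL shapes shown above."""
    parts = [base.rstrip("/"), "search"]
    if author:
        # The '~' prefix marks the author segment, as in the second example URL.
        parts.append("~" + quote(author, safe=""))
    parts.append(quote(query, safe=""))
    return "/".join(parts)

print(search_url("http://example.com/clearspace", "openID"))
print(search_url("http://example.com/clearspace", "openID", author="aaron"))
```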

 

Another thing that I believe has worked in the past but that we haven't talked about is the idea of a simple syntax for search. Much like the operators you can use in Google (e.g., 'site:jivesoftware.com lucene' will find all the references to 'lucene' on the domain 'jivesoftware.com'), we now support the following operators: 'subject:', 'body:', 'tags:' and 'attachmentstext:'. While I admit that they're not the most user-friendly things to type in, they do give advanced users a little bit more flexibility. For example, you can now ignore tags if you want by doing a search like this: 'subject:lucene OR body:lucene'. The search syntax operators are slated to appear in the search tips documentation that sits right alongside the search box. Again, this is for 2.1.
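For anyone building such queries programmatically, here is a tiny sketch of assembling a field-scoped query string. The `field_query` helper is hypothetical, and it only handles single-word terms; the field names come from the operator list above.

```python
def field_query(term, fields):
    """Join field-scoped clauses with OR, e.g. 'subject:lucene OR body:lucene'."""
    return " OR ".join(f"{field}:{term}" for field in fields)

# Search subject and body only, deliberately leaving tags out.
print(field_query("lucene", ["subject", "body"]))
```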

 

Those were the improvements. Now the bug fixes (which just so happen to also really improve your searching experience). 

  • Search stemming doesn't seem to be working (CS-3645): Not sure how long this wasn't working, but the existence of this bug meant that if you put the word "error" in a document and then did a search for "errors", our search engine wouldn't find your document. Read more about stemming if you're curious about that sort of thing. If you're seeing this bug, make sure a) you upgrade to at least Clearspace 2.x and b) make sure that you're using a stemming indexer. The default analyzer does not stem. You can change the indexer by going to the admin console --> system --> settings --> search --> search settings tab --> indexer type.

  • Group search results by thread setting in admin console doesn't change search behavior (CS-3656): I'm not sure how long this feature has been around, but there is a search setting in the admin console that gives you the ability to group all messages in a thread into one result in a search results page so that the messages in a single thread don't overwhelm the search results (since messages share the subject and tags from the thread, that actually happened quite a bit). Fixed in 2.0, I highly recommend turning it on in your instance if you haven't already.

  • Search updates to better balance queries across content types (CS-3638): Some improvements were made in 2.0.0 toward this issue, but it's 100% fixed in 2.0.3 and 2.1. There were two really big but really, really hard to see problems with the way that we were executing our search queries. First, a quick background on how a search query is performed in Clearspace against all content types. We have a single Lucene index for all the content in Clearspace (there is a separate index for user data, but that's a different story), so when a search for 'bananas' was executed, we did something like this (don't read too much into the language I'm using, I'm just trying to illustrate how it works at a 30,000 foot level):

  1. get blog posts that match query

    • find all the blog posts where the subject matches 'bananas' OR the body matches 'bananas' OR the tags match 'bananas' OR the attachments match 'bananas' OR the blogID matches 'bananas'

  2. get discussions that match query

    • find all the messages where the subject matches 'bananas' OR the body matches 'bananas' OR the tags match 'bananas' OR the attachments match 'bananas' OR the threadID matches 'bananas'

  3. get documents that match query

    • find all the documents where the subject matches 'bananas' OR the body matches 'bananas' OR the summary matches 'bananas' OR the fieldsText matches 'bananas' OR the tags match 'bananas' OR the attachments match 'bananas' OR the documentID matches 'bananas'

  4. merge results from steps 1-3 using the relevance score from each item in the result set as the comparator

  5. display results
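The five steps above can be sketched roughly as follows. This is an illustration, not the Clearspace implementation: the hit lists and scores are invented, and `merge_by_score` is a hypothetical helper that assumes each per-type list is already sorted by descending score.

```python
import heapq

def merge_by_score(*result_lists):
    """Merge per-content-type hit lists into one list ordered by descending score."""
    # heapq.merge needs each input sorted in the same direction as the output.
    return list(heapq.merge(*result_lists, key=lambda hit: hit[0], reverse=True))

# Steps 1-3: one (score, title) list per content type; values are made up.
blog_hits = [(0.9, "blog post A"), (0.4, "blog post B")]
thread_hits = [(0.7, "thread C")]
document_hits = [(0.8, "document D"), (0.2, "document E")]

# Step 4: merge using the relevance score as the comparator.
merged = merge_by_score(blog_hits, thread_hits, document_hits)

# Step 5: display results.
for score, title in merged:
    print(f"{score:.1f}  {title}")
```

Note that the merge only produces a sensible ranking if scores from the three queries are comparable with one another.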

The assumption we made when writing this code was that the scores that Lucene returns for items of every content type would be relatively similar. More concretely, if I had a document and a blog post which for some reason had identical content, I'd expect they would both have the exact same Lucene relevance score if they came up in the results of a search. But that assumption turned out to be wrong, not once but twice.

 

First, as you can see from glancing at the sample queries I pasted above, we searched a different number of fields per content type. Who cares, right? Why would the number of fields that you search on influence anything? Turns out that Lucene cares: the way scoring works in Lucene is that if you search on ten fields and only get a hit on two of them in document 'X', the resulting relevance score should be less than for a hit on blog post 'Y' where you search on five fields and get a hit on two of them. It makes perfect sense when you think about it: it's just like the tests you had in school. Getting a 4 out of 5 on a test works out to be 80%, about a B. If you got 4 out of 10 on a test, that's 40%, you failed. You probably called them grades... maybe sometimes even a score, which just so happens to be exactly how Lucene refers to the relevance that a particular document has to a given query (if you're curious about how Lucene does scoring / relevance you should check out the JavaDoc for the Similarity class and also read this document on scoring). Anyway, this behavior was actually fixed in 2.0: so now when we execute a search on mixed content we search the exact same number of fields for each content type: subject, body, tags, attachmentsText.
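The school-test analogy maps onto what Lucene's classic scoring calls the coordination factor: the number of query clauses a hit matched divided by the number of clauses in the query. Here is a minimal sketch of just that one factor, assuming the classic default Similarity; a real Lucene score multiplies in several other terms, and the field counts below are taken from the example in the paragraph above.

```python
def coord(overlap, max_overlap):
    """Fraction of query clauses matched, as in Lucene's classic coordination factor."""
    return overlap / max_overlap

# Same two matching fields, different number of fields searched.
document_x = coord(2, 10)   # hit on 2 of 10 searched fields
blog_post_y = coord(2, 5)   # hit on 2 of 5 searched fields
print(document_x, blog_post_y)

# Searching the same four fields for every content type removes the skew.
print(coord(2, 4) == coord(2, 4))
```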

 

The second assumption that turned out to be wrong was just as nebulous. I illustrated above how we search in Clearspace: we do searches for each content type and then we merge the results of those searches into a single result set. In order to do a query for just blog posts that match the word 'token', we do a query that in Lucene looks like this:

+objectType:38 +(subject:token body:token attachmentsText:token tags:token)

It kind of looks like a SQL subselect: get me all the things where one of subject, body, tags or attachmentsText match and then, from those results, only return the results where the objectType is 38 (which is the int that JiveConstants.BLOGPOST refers to). The thing that killed us here was the outer statement

+objectType:38

because:

a) when Lucene executes a query, it computes a query weight and field weight for each statement in your query and multiplies those two values together to get the total weight for that statement and

b) the query weight is basically a measure of how many times the key (in this case 'objectType') appears divided by the number of times the value appears (in this case '38') which means that

c) content objects that you have less of (in our case: blog posts) will tend to have a score much higher than content objects that you have a lot of (in our case: documents). Again, this makes sense: items that appear in the index a relatively small number of times are in some sense rare and so they should get a relatively higher weight. Regardless, it turns out there's a really easy fix for this problem as well: you can boost specific fields in your query like this:

subject:token^3

and you can effectively neuter a field by boosting it to zero:

subject:token^0

which means that Lucene will look for all the items in the index whose subject is 'token', but the weight, which usually influences the score that it assigns to the field 'subject', will not influence the score that the resulting item receives.
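To make points (b) and (c) concrete, here is a rough sketch. It uses the inverse document frequency formula from Lucene's classic default Similarity, idf = 1 + ln(numDocs / (docFreq + 1)). The index size and per-type counts are made-up numbers, `type_filtered_query` is a hypothetical helper, and applying the ^0 boost to the objectType clause is an assumption about how the fix would be written, since the final query is not shown here.

```python
import math

def idf(doc_freq, num_docs):
    """Inverse document frequency as in Lucene's classic default Similarity."""
    return 1.0 + math.log(num_docs / (doc_freq + 1))

NUM_DOCS = 10_000     # hypothetical total number of indexed items
BLOG_POSTS = 200      # hypothetical: few blog posts, so objectType:38 is rare
DOCUMENTS = 6_000     # hypothetical: many documents, so their objectType is common

# The rarer objectType value gets the larger weight, inflating blog post scores.
print(round(idf(BLOG_POSTS, NUM_DOCS), 2), round(idf(DOCUMENTS, NUM_DOCS), 2))

def type_filtered_query(object_type, term,
                        fields=("subject", "body", "attachmentsText", "tags")):
    """Build a per-type query whose objectType clause filters but does not score."""
    inner = " ".join(f"{field}:{term}" for field in fields)
    return f"+objectType:{object_type}^0 +({inner})"

print(type_filtered_query(38, "token"))
```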

 

We're continually looking to improve the search tools in Clearspace. If you're seeing something you don't expect or if there's something cool you'd like us to add, please pipe up in our Support and Feature Discussion spaces here on Jivespace.

 

 

1,482 Views 0 Comments Permalink Tags: clearspace, search, lucene, improvements, stemming

We just launched our new Jivespace Plugin Directory for Clearspace plugins! You can now more easily find and download plugins from Jivespace, and if you are developing plugins, you can add your plugins to the new directory.

 

Plugin Directory

 

The really cool part is that we created the entire plugin directory as a plugin to Clearspace. Our web engineer, Tim, took our existing document content type and extended it to create a new plugin type with additional information relevant to plugins along with a new look and feel for the plugin directory.  He also used the plugin jar to pull almost all of the metadata displayed with the plugins, including license, logo, readme, compatible versions, and much more. For developers, this means that you only need to enter the information in your plugin, instead of having to duplicate all of the information by filling out redundant forms. It also means that when you update your plugin jar file with a new release, the plugin information will be automatically updated.

 

Tim is currently working on polishing the code a bit, but he will be releasing this as a plugin for other people to use as a plugin marketplace. It also provides a very useful example of how to extend an existing Clearspace content type to create a new content type in Clearspace.

 

Keep in mind that this is just the first version of our Plugin Directory, and we plan to start making incremental improvements and enhancements. But first, we want to hear from you. Take a look at the new Jivespace Plugin Directory. What do you like? dislike? How can we make this even better in future revisions?  Please leave comments with your ideas on this blog post!

 

 

 

 

1,111 Views 1 Comments Permalink Tags: jivespace, clearspace, plugins, plugin_directory

Learn all about how to write new widgets for Clearspace 2.0 from Aaron Johnson, Engineering Manager at Jive.

 

 

You can also download the Quicktime version (Caution: file is ~285MB), or you can watch a larger version online, which will improve readability of embedded screenshots (recommended).

 

The entire presentation is also attached below as a PDF file.

 

You can also watch an earlier video about developing widgets for Clearspace 2.0

1,480 Views 0 Comments Permalink Tags: plugin, jivespace, podcast, video, clearspace, widget, 2.0

This is the second blog post about real time communication support in Clearspace, a series that started with Connecting a chat client to Clearspace. Today we are going to cover chat events that are going to be available in Clearspace 2.1. Let's start by giving some personal examples to illustrate real uses for chat events.

 

  1. Every Wednesday at 10:00 AM PST developers of igniterealtime.org projects join the chat service to answer development questions. Members of the community can join the chat service using their XMPP client of choice or using the web client. Moreover, users of other XMPP servers can also join the chat service as long as server-to-server is enabled.

  2. Every Monday morning developers of the Real Time Communication team, which includes local and remote developers, join a chat room to discuss status updates and goals for the week.

  3. Last week I was invited to a meeting to discuss a technical problem about clustering.

 

In the above examples we see different uses for chat events. The first example shows a recurring chat event that is not associated with any space, project or social group but with the entire site. In the second example we have a project, but it could be a space or social group, whose members meet every week to discuss their work. As you would expect, permissions of the container are applied to determine who can join the group chat. And finally, we see that scheduled chats could be a one-time-only activity.

 

A chat transcript is created for each occurrence of the chat event. When the event is over the transcript is "closed" and moderators can moderate/edit it. Like any content in Clearspace, transcripts are indexed in real time and can be searched just like you search for a document or a blog.

 

A persistent room in Openfire is created for each chat event. A new groupchat service was added to Openfire that interacts with Clearspace to control who can create, join, configure and delete rooms. As I said in my previous blog post, it is possible to use your own XMPP client to join a chat event or you can just chat from the Clearspace site. File transfer in rooms was also added to Openfire, but the Clearspace side is not ready yet, so the feature will be ready for the next release of Clearspace. However, since you can use your own XMPP client it is still possible to make use of the new functionality to share files in a room. The file transfer feature is based on WebDAV File Transfers.

 

Next week we will cover other types of conversations that are going to be part of Clearspace 2.1.

 

 

 

2,711 Views 3 Comments Permalink Tags: clearspace, integration, rtc, openfire

Learn the basics of how to develop a plugin for Clearspace 2.0 from Jive engineer Jon Garrison. Jon talks about Spring, Struts, and more in this video.

 

 

You can also download the Quicktime version (Caution: file is ~140MB), or you can watch a larger version online, which will improve readability of embedded screenshots (recommended).

 

The entire presentation is also attached below as a PDF file.

 

1,241 Views 0 Comments Permalink Tags: podcast, video, clearspace, plugins, customization

Did you know that we provide free licenses of Clearspace for open source projects and user groups (like JUGs, for example)? I'm guessing that a few of you have side projects working with open source communities or local developer user groups who might be interested in using Clearspace for collaboration.

 

Here are a few groups using our free licenses already:

 

You can learn more about the requirements and request a free license on Jivespace.

 

1,878 Views 2 Comments Permalink Tags: clearspace, free, license, opensource

As many of you know, we have been working heavily for the last few months on integrating Openfire with Clearspace. This is the first of a series of blog posts that will cover the things that you are already able to do in Clearspace 2.0 when using Openfire 3.5 and new things that you will be able to do in Clearspace 2.1.

 

I will start by describing what Openfire is. Openfire is the award-winning, open alternative to proprietary instant messaging. It uses the only widely adopted open protocol for instant messaging, XMPP (also called Jabber). Since XMPP is a standard protocol, any client that understands the protocol can connect to the server. Examples of clients are Pidgin (formerly Gaim), Adium, Trillian, Psi, Spark and many others. Moreover, you can also use web clients like meebo or our own SparkWeb client to connect to Openfire.

 

Openfire can be configured to read the list of users, groups and user authentication from different backends. As of Openfire 3.5 we added the option to instruct Openfire to obtain that information from Clearspace. That means that if you have an account in Clearspace then you can use the same credentials to connect to Openfire and chat with other Clearspace users. If server-to-server is enabled on the Openfire server then you can also chat with GTalk users or with users of other Clearspace instances that have installed Openfire. Moreover, you can also chat with AOL, MSN, Yahoo or ICQ users by just installing the gateway plugin in Openfire.

 

One popular feature in Openfire is called shared groups. Shared groups are groups that are pre-populated in the contact list of your chat client from the server. Clearspace 2.1 will use the shared group functionality to automatically expose your social groups in your roster. That means that if you are part of a social group then all the members of that group will appear in your roster. Next on the list is exposing project teammates, and last on the list is your friending network.

 

Next week we are going to cover how to use groupchat from Clearspace.

 

3,244 Views 5 Comments Permalink Tags: clearspace, integration, rtc, openfire

Theming in Clearspace 2.0

Posted by Dawn Foster May 12, 2008

As you know, we changed a few things in our underlying architecture for Clearspace 2.0, including some changes in the Freemarker templates as a result of moving from Webwork to Struts along with some other changes. In this video, Matt Walker, Professional Services Engineer at Jive Software, talks about the process of upgrading existing themes along with plenty of best practices to make your themes more easily upgradeable in the future.

 

Matt also did an earlier screencast as an Introduction to Skinning Clearspace, which you might also want to watch along with this video.

 

 

You can also download the Quicktime version (Caution: file is ~200MB), or you can watch a larger version online, which will improve readability of embedded screenshots (recommended).

 

The entire presentation is also attached below as a PDF file.

1,422 Views 2 Comments Permalink Tags: jive_software, jivespace, podcast, video, clearspace, themes, freemarker, customization, struts

I wanted to remind everyone that we have a Jivespace weekly group chat scheduled for tomorrow (and every Thursday) from 9-10am Pacific time. During this hour, you can ask the engineers who wrote the software any questions about Clearspace development topics.

 

Do you have questions about

  • how we are using Spring, Struts, Acegi, and more in Clearspace 2.0?

  • a particularly difficult customization?

  • writing plugins and widgets?

  • accessing Clearspace data from other sites using web services?

  • any other developer topic?

 

Please feel free to drop in anytime during the hour to ask questions. We also post all of the chat transcripts to Jivespace.

937 Views 0 Comments Permalink Tags: jivespace, community, clearspace, chat

Learn about the architectural and other technical changes that we made in Clearspace 2.0 from Jive engineering manager Nick Hill along with an overview of the new features from Clay Moore, Jive product manager.

 

This 20 minute video covers Spring, Acegi, Struts and more on the technical side. New features including personalized home pages, projects, organizational relationships, and document sharing are also reviewed in the video.

 

 

 

You can download the Quicktime Movie version (Caution: 250 MB file) or watch a larger Flash version.

 

The slides from this presentation are attached below.

1,240 Views 4 Comments Permalink Tags: podcast, video, clearspace, clearspace_2

Have you ever wanted to display your external (public) corporate blog inside the internal Clearspace instance used by your employees? Would you like to display content from your personal blog within your Clearspace instance?

 

The Feed Your Blog plugin gives you the ability to do both of the above and more. This plugin allows your Clearspace instance to periodically poll an RSS or Atom feed and have it post any new entries that it finds to a blog that you specify.
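The core of that polling loop is spotting entries that have not been posted yet. The following is a minimal sketch of that idea only, not the plugin's code: the sample feed, the `new_entries` helper, and the choice to deduplicate on the RSS guid element are all assumptions made for illustration, and a real poller would fetch the feed over HTTP on a schedule and handle Atom as well.

```python
import xml.etree.ElementTree as ET

# Invented sample feed standing in for a fetched RSS document.
SAMPLE_RSS = """<rss version="2.0"><channel>
<item><guid>post-1</guid><title>First</title></item>
<item><guid>post-2</guid><title>Second</title></item>
</channel></rss>"""

def new_entries(rss_xml, seen_guids):
    """Return (guid, title) pairs for unseen items and record their guids as seen."""
    fresh = []
    for item in ET.fromstring(rss_xml).iter("item"):
        guid = item.findtext("guid")
        if guid and guid not in seen_guids:
            seen_guids.add(guid)
            fresh.append((guid, item.findtext("title")))
    return fresh

seen = {"post-1"}                      # entries already posted to the blog
first_poll = new_entries(SAMPLE_RSS, seen)
second_poll = new_entries(SAMPLE_RSS, seen)
print(first_poll, second_poll)
```

Only the unseen entry comes back on the first poll, and polling the unchanged feed again returns nothing, so each feed entry would be posted once.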

 

You can get this plugin and other plugins by visiting our plugin page. You can also view the source code of our plugins by browsing our svn repository.

1,081 Views 0 Comments Permalink Tags: plugin, clearspace, plugins, rss, feeds

DWR in Clearspace 2.0

Posted by Dawn Foster Apr 18, 2008

Aaron Johnson, Jive Engineer, presented to our engineering team about how DWR is used in Clearspace 2.0. He started by walking us through an overview of DWR. After the overview, he showed us exactly how he used DWR in his FeedBlog plugin.

 

This 7 minute video has the highlights from his presentation.

 

 

You can watch a larger Flash version or download the Quicktime Movie version (Caution: 1761MB file)

1,139 Views 0 Comments Permalink Tags: jivespace, podcast, video, clearspace, 2.0

Here are some resources for anyone wanting to migrate to Clearspace 2.0 or just learn more about the development environment for Clearspace 2.0.

 

A week ago, we did a series of presentations about Clearspace 2.0 development. We also videotaped all of the presentations, but the editing will take some time for the video, so I wanted to go ahead and share PDFs of the presentations now with the Jivespace community. The videos should be coming out at a rate of 1-2 per week over the next few weeks.

 

I have attached 5 presentations, and I suggest reading them in this order:

  • Clearspace 2.0 Overview

  • Theming

  • Plugins

  • Widgets

  • Web Services

 

 

1,717 Views 0 Comments Permalink Tags: jivespace, clearspace, developers, 2.0

Clearspace 2.0.1 Released

Posted by Dawn Foster Apr 18, 2008

Clearspace 2.0.1 was released last night. It has a number of bug fixes over the 2.0.0 release.

 

It also has significant improvements to the source build. Several people posted issues with our source build here on Jivespace, and we think that this version should resolve those issues.

 

Existing customers can download the new source build or the new application files from their "My Account" page. If you want an evaluation version of Clearspace or Clearspace Community 2.0.1, you can find it on the Jivespace downloads page.

1,137 Views 5 Comments Permalink Tags: jivespace, clearspace, 2.0