Creating an RSS Feed in JHTML

Because we use an older version of ATG Dynamo (6.2.0) at my place of employment, we’re sometimes constrained in what we can do and aren’t able to take advantage of features in newer versions of Dynamo. One of those features is the ability to generate an RSS feed from a content repository (which I believe is available now as part of the suite of personalization features ATG offers). If you have the ability to deploy droplet code, there’s certainly an RSSFeed module. But if all you need is a basic RSS feed for a particular collection, the following is what I came up with.

<?xml version="1.0" encoding="UTF-8" ?>
<importbean bean="/atg/targeting/TargetingForEach">
<importbean bean="/atg/dynamo/droplet/Switch">
 
<%-- import files --%> 
<%@ page import="java.util.*, java.text.*" %>
 
<%-- Generate the RSS XML --%>
 
<%-- Set the rss type --%>
<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:slash="http://purl.org/rss/1.0/modules/slash/" xmlns:wfw="http://wellformedweb.org/CommentAPI/">
   <channel>
      <title>News</title>
      <link>http://www.foobar.com/</link>
      <description>News and information.</description>
      <textInput>
         <title>News</title>
         <link>http://www.foobar.com/</link>
         <description>News and information.</description>
      </textInput>
      <skipHours>
         <hour>0</hour>
      </skipHours>
      <language>en-us</language>
      <webMaster>webmaster@foobar.com</webMaster>
 
      <droplet bean="/atg/targeting/TargetingForEach">
         <param name="targeter" value="bean:/atg/registry/RepositoryTargeters/News/archiveNews">
         <param name="fireContentEvent" value="false">
         <param name="fireContentTypeEvent" value="false">
         <param name="sortProperties" value="-postingDate">
         <oparam name="output">
            <droplet bean="/atg/dynamo/droplet/Switch">
               <param name="value" value="param:element.linkURL">
               <oparam name="unset">
                  <item> 
                     <title><valueof param="element.title"/></title>
                     <description><valueof param="element.description"/></description>
                     <link>http://www.foobar.com/news/news.jhtml?reposid=<valueof param="element.repositoryId"/></link>
                     <pubDate><valueof param="element.postingDate" converter="Date" date="EEE, dd MMM yyyy HH:mm:ss Z"/></pubDate>
                  </item>
               </oparam>
               <oparam name="default">
                  <item> 
                     <title><valueof param="element.title"/></title>
                     <description><valueof param="element.description"/></description>
                     <link><valueof param="element.linkURL"/></link>
                     <pubDate><valueof param="element.postingDate" converter="Date" date="EEE, dd MMM yyyy HH:mm:ss Z"/></pubDate>
                  </item>
               </oparam>
            </droplet>
         </oparam>
         <oparam name="empty">
            <!-- No archived news -->
         </oparam>
      </droplet>
   </channel>
</rss>

Two things matter most in an RSS feed: a properly formatted date and a correct set of elements, since some readers are more forgiving than others. The date format here is the RFC 822 style that RSS 2.0 calls for and not the older ISO 8601, though you can read all about the ins and outs of RSS dates. Many elements are optional, but leaving them out may make the output display strangely depending on the reader; they are included here primarily for consistent display in the built-in readers in IE and Firefox.
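To check the date pattern outside of Dynamo, here is a minimal sketch in plain Java. The class and method names (RssDateExample, toRssDate) are made up for illustration, and it assumes the valueof date converter behaves like java.text.SimpleDateFormat with the same pattern string.

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

// Hypothetical helper, not part of the feed page itself.
class RssDateExample {
    // Same pattern string as the pubDate converter in the feed.
    static final String RFC822_PATTERN = "EEE, dd MMM yyyy HH:mm:ss Z";

    static String toRssDate(Date date, TimeZone zone) {
        // Locale.US keeps the day and month names in English regardless of server locale.
        SimpleDateFormat format = new SimpleDateFormat(RFC822_PATTERN, Locale.US);
        format.setTimeZone(zone);
        return format.format(date);
    }

    public static void main(String[] args) {
        // The epoch (0 ms) is only a fixed, reproducible sample date.
        // Prints: Thu, 01 Jan 1970 00:00:00 +0000
        System.out.println(toRssDate(new Date(0L), TimeZone.getTimeZone("UTC")));
    }
}
```

Pinning the locale is the part that is easy to miss: on a server with a non-English default locale, the day and month abbreviations would otherwise come out localized, and stricter readers may reject the date.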


A Breadcrumb Story

Where I work, one of the site features is a breadcrumb trail on all pages allowing linking through the content hierarchy. All of the breadcrumbs in the trail were generated by a rather lengthy JavaScript file. The main reason for the length was how the folders were named and the number of exceptions or special cases. As a note, for this to work all directory names were CamelCase (lowerCamelCase, to be precise) except where they were legacy items or inconsistently named (increasing the number of special cases that needed handling).

But it worked better than much breadcrumb code I’ve seen, so all was well and good until two things happened. The first was that mysterious 404 errors began to crop up in our logs. I finally traced them back to bots crawling the site, coming across the pieces of the JavaScript that constructed the links in the breadcrumbs, and indexing them as valid (not processed) links. So our logs were littered with failures to find pages like about_gh/.*/badministrator/b.*)/i, or /health_plans/+howmany[i+2]+. The second was that we began developing a new version of one of our sites, and it didn’t make sense to use the existing JavaScript since the majority of its special cases were for our main site. Further, we would just be perpetuating the log errors caused by misbehaving search engine bots.

So, for the new site, I rewrote the JavaScript as a JHTML include. Since we’re using Dynamo and haven’t migrated any of our dynamic pages to JSP (we encountered some problems with that due to the older version we’re running), I learned some interesting things about using Java inline on a single page like this.

The first part of the page requires declaring any imports, like so:

<java type="import">
	java.io.*;
	java.util.regex.*;
</java>

The second part of the page is the main Java code that gets the URL and performs some operations on it:

<java>
	// Assumed declarations: the original snippet uses these three variables
	// without showing where they are set, so the values here are placeholders.
	String divider = " &gt; ";
	String baseLoc = "";
	String breadcrumbPathString = "";

	String originalURL = request.getRequestURL().toString();
 
	if (originalURL.indexOf("somejavaAppPathID") != -1) {
		if (originalURL.indexOf("javaApp") != -1) {
			breadcrumbPathString += divider + "<a href=\"/somespecialidentifierParent/index.jhtml\">JavaAppParent</a>" + divider + "<a href=\"/somejavaAppPathID/javaApp/JavaAppMethod?forwardUrl_success=/someAppResultLocation/javaApp/index.jhtml\">JavaApp</a>";
		}
	} else if (originalURL.indexOf("pageNameWithParam.jhtml") != -1) {
			// paramKey names the request parameter that holds the path to the content
			String paramKey = "paramname";
			breadcrumbPathString = makeTrail(request.getParameter( paramKey ), baseLoc, divider);
	} else {
		breadcrumbPathString = makeTrail(request.getRequestURI(), baseLoc, divider);
	}
 
	breadcrumbPathString = fixDirectories(breadcrumbPathString);
 
	out.print(breadcrumbPathString);
</java>

Originally, my thought was to use the Request Scheme like this:

String thisRequestScheme = request.getScheme();
if (thisRequestScheme.startsWith("https")) {
	baseLoc = "https://" + request.getServerName();
} else {
	baseLoc = "http://" + request.getServerName();
}
if (request.getServerPort() != 0) {
	baseLoc += ":" + request.getServerPort();
}

But since our server configuration includes F5 load balancers, that was incorrect: the request object contains different information than the URL we’re looking for. I’d prefer to use this implementation, but the previous one starting with request.getRequestURL() will work.
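For comparison, here is a small standalone sketch of the scheme-based idea that leaves the port out when it is the default for the scheme. The class and method names (BaseUrlExample, baseUrl) are hypothetical, and the three arguments stand in for request.getScheme(), request.getServerName() and request.getServerPort(). It does nothing about the load balancer issue: behind one, those values may still describe the balancer-to-server connection rather than the URL the visitor used.

```java
// Hypothetical helper; the arguments stand in for values read from the request.
class BaseUrlExample {
    // Builds scheme://host[:port], omitting the port when it is the scheme's default.
    static String baseUrl(String scheme, String host, int port) {
        boolean secure = scheme != null && scheme.startsWith("https");
        String base = (secure ? "https://" : "http://") + host;
        int defaultPort = secure ? 443 : 80;
        if (port > 0 && port != defaultPort) {
            base += ":" + port;
        }
        return base;
    }

    public static void main(String[] args) {
        System.out.println(baseUrl("https", "www.foobar.com", 443));
        System.out.println(baseUrl("http", "www.foobar.com", 8080));
    }
}
```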

Lastly, I created some methods for handling special cases and doing some other work on the URLs:

<java type="class">
	public String makeTrail(String breadCrumb, String urlPath, String thisDivider) throws IOException {
		urlPath += "/";
		String outputString =  "<a href=\""+urlPath+"index.jhtml\">Home</a>";
		String linkName = "";
		String linkString = breadCrumb;
		// Check for null before touching the path; removeDirectories would throw on null
		if (breadCrumb != null) {
			breadCrumb = removeDirectories(breadCrumb);
			String[] breadCrumbArr = breadCrumb.split("/");
			String[] linkStringArr = linkString.split("/");
			if(breadCrumbArr.length!=0) {
				for(int i=1; i<linkStringArr.length-1; i++) {
					urlPath += linkStringArr[i]+"/";
					if (breadCrumbArr[i] != null && breadCrumbArr[i].length() > 0) {
						linkName = makeProper(breadCrumbArr[i]);
						outputString +=  thisDivider+"<a href=\""+urlPath+"index.jhtml\">"+linkName+"</a>" ;
					}
				}
			}
		}
		return outputString;
	}
 
	public String makeProper(String theString) throws IOException {
		StringReader in = new StringReader(theString);
		boolean precededBySpace = true;
		boolean precededByCap = true;
		StringBuffer properCase = new StringBuffer();
		while(true) {
			int i = in.read();
			if (i == -1)  break;
			char c = (char)i;
			if (Character.isSpaceChar(c)) {
				properCase.append(c);
				precededBySpace = true;
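			} else if (Character.isUpperCase(c)) {
				// No extra space if one was just written or we are inside a run of capitals
				if (precededByCap || precededBySpace) {
					properCase.append(c);
				} else {
					properCase.append(' ');
					properCase.append(c);
				}
				precededByCap = true;
				// Reset so the next lowercase letter is not capitalized as well
				precededBySpace = false;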
			} else if (c == '-'||c == '_') {
				properCase.append(' ');
				precededBySpace = true;
			} else {
				if (precededBySpace) {
					properCase.append(Character.toUpperCase(c));
				} else {
					properCase.append(Character.toLowerCase(c));
				}
				precededBySpace = false;
				precededByCap = false;
			}
		}
		return properCase.toString();
	}
 
	public String removeDirectories(String theString) {
		//completely remove these directories from the breadcrumbs
		theString = replaceWith(theString,"ignore/","/");
		theString = replaceWith(theString,"anotherDirectoryToSkip/","/");
		return theString;
	}
	public String fixDirectories(String theString) {
		//replace these nonsense names with real words
		//    theString.replaceWith("(?i)\\babout[ _]us\\b","About Us"); //A weird special case since the directory name is not CamelCase
		theString = replaceWith(theString,"(?i)\\bpeople\\splaces\\b","People &amp; Places");
		theString = replaceWith(theString,"(?i)\\babbr\\b","Full Unabbreviated Directory Name");
		return theString;
	}
 
	private static String replaceWith(String aPath, String aPattern, String aReplacement ){
		Pattern pattern = Pattern.compile(aPattern, Pattern.CASE_INSENSITIVE);
		Matcher matcher = pattern.matcher(aPath);
		return matcher.replaceAll(aReplacement);
	}
</java>
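To try the trail-building idea outside of a JHTML page, here is a simplified standalone sketch. It is not the code above: it swaps the character-by-character loop for a regular expression that finds camelCase boundaries, it skips the special-case handling, and it targets a current JDK rather than the one bundled with our Dynamo. The names BreadcrumbSketch, toLabel and trail are made up for illustration, and it assumes the request URI starts with a slash and ends with a file name.

```java
import java.util.regex.Pattern;

// Hypothetical standalone version of the breadcrumb idea.
class BreadcrumbSketch {
    // Zero-width position between a lowercase letter or digit and an uppercase letter.
    private static final Pattern CAMEL_BOUNDARY = Pattern.compile("(?<=[a-z0-9])(?=[A-Z])");

    // Turns a directory name such as "healthPlans" or "about_us" into display words.
    static String toLabel(String directory) {
        String spaced = CAMEL_BOUNDARY
                .matcher(directory.replace('_', ' ').replace('-', ' '))
                .replaceAll(" ");
        StringBuilder label = new StringBuilder();
        for (String word : spaced.trim().split("\\s+")) {
            if (word.length() == 0) {
                continue;
            }
            if (label.length() > 0) {
                label.append(' ');
            }
            label.append(Character.toUpperCase(word.charAt(0))).append(word.substring(1));
        }
        return label.toString();
    }

    // Builds the trail for a URI like "/healthPlans/about_us/index.jhtml".
    static String trail(String requestUri, String divider) {
        StringBuilder html = new StringBuilder("<a href=\"/index.jhtml\">Home</a>");
        String[] parts = requestUri.split("/");
        String path = "/";
        // Start at 1 to skip the empty string before the leading slash;
        // stop before the last element, which is the file name.
        for (int i = 1; i < parts.length - 1; i++) {
            if (parts[i].length() == 0) {
                continue;
            }
            path += parts[i] + "/";
            html.append(divider)
                .append("<a href=\"").append(path).append("index.jhtml\">")
                .append(toLabel(parts[i]))
                .append("</a>");
        }
        return html.toString();
    }

    public static void main(String[] args) {
        System.out.println(trail("/healthPlans/about_us/index.jhtml", " &gt; "));
    }
}
```

The regular expression keeps runs of capitals such as FAQ together, since it only splits where a lowercase letter or digit is followed by a capital.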

In reviewing this code, one thing I was unsure about was whether using

String originalURL = request.getRequestURL().toString();

was the correct way to go or if I should have used

String path = request.getPathTranslated();
path = replaceBacklash(path);

I realized that, for the way the site functions, we may pass an item parameter and we want to properly build the path to that item, not the page rendering the item. Using getRequestURL handles both situations.


JHTML Parameter Munging with Java

On our site, we have a search engine that does a file system crawl and returns a list of items, generating the URL to each item as it goes. We recently started using TeamSite forms for editors to create content and save it as XML. That XML is then rendered using a custom JHTML page that applies the appropriate XSL sheet and does some other work, but there’s no way to inform the search engine when that’s the case (at least not without splitting search into multiple repositories and doing a bunch of other work with the repositories).

One possible solution that I came up with, before abandoning it as a maintenance headache, is to detect the content type by its extension and then insert the necessary render page into the URL.

So we take the original JHTML values with the item parameter

<param name="searchItemURL" value="param:element.URLString">

that comes from the search results bean as it iterates through each item and generates the list of results within

<oparam name="output">

The markup used to generate the href for each result is this (note the backticks for the href):

<a href="`request.getParameter("searchItemURL")`"> <valueof param="element.title"></valueof></a>

And replace that with this snippet of Java code:

<java>
	String searchItemURL = request.getParameter( "element.URLString" );
	// Check for null first so endsWith() cannot throw
	if(searchItemURL != null) {
		if(searchItemURL.endsWith(".xml")) {
			String[] linkStringArr = searchItemURL.split("/open");
			// Only rewrite when "/open" actually splits the URL into two parts
			if(linkStringArr.length > 1) {
				searchItemURL = linkStringArr[0]+"/open/render.jhtml?item=/open"+linkStringArr[1];
			}
		}
		out.print("<a href=\""+searchItemURL+"\">"+request.getParameter( "element.title" )+"</a>");
	}
</java>
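The rewriting step on its own can be sketched in plain Java, which makes it easier to test away from the search results page. The class and method names (RenderUrlSketch, toRenderUrl) are hypothetical, and this version uses indexOf rather than split so that a URL containing "/open" more than once is only cut at the first occurrence. Like the snippet above, it does not URL-encode the item path.

```java
// Hypothetical standalone version of the URL rewrite.
class RenderUrlSketch {
    static final String MARKER = "/open/";
    static final String RENDER_PAGE = "/open/render.jhtml?item=";

    // Routes .xml items through the render page; every other URL is returned unchanged.
    static String toRenderUrl(String url) {
        if (url == null || !url.endsWith(".xml")) {
            return url;
        }
        int at = url.indexOf(MARKER);
        if (at < 0) {
            // No "/open/" segment, so there is nothing to rewrite.
            return url;
        }
        // Keep everything before "/open/", then pass the rest as the item parameter.
        return url.substring(0, at) + RENDER_PAGE + url.substring(at);
    }

    public static void main(String[] args) {
        System.out.println(toRenderUrl("http://www.foobar.com/open/news/item.xml"));
    }
}
```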
