Finding the best way to query news items

The XSLT code for the site is done! Now I am focusing more on different ways the user can search for different news items. As mentioned in a previous post, I’ve been looking at using a multi-level JavaScript drop-down menu, a search box or a combination of the two.

Trying to present every country and city that appears in the scraped RSS XML produces a drop-down or scroll-down menu that is far too long to be practical. Also, instead of using a menu for sorting stories by format (text, video, audio, etc.), I’d rather use the simple icon key explained in that same post. Thus, menus for my project are a no-go for now.

Another minor issue I’d like to work out is how to present overlapping file types. For example, I want to be able to present the link to a story that has text with an embedded video as having video and text. So far I’m thinking of using another so-called “code scraper” such as HtmlCleaner, which turns HTML code into plain text.
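One way to picture the overlapping-format check is the minimal sketch below. It is written in Python rather than with HtmlCleaner, and it assumes that embedded media shows up as `video` or `audio` tags, which may not hold for every feed. The names `MEDIA_TAGS`, `FormatDetector` and `detect_formats` are made up for illustration.

```python
from html.parser import HTMLParser

# Assumption: embedded media appears as <video> or <audio> tags.
# Older embeds (<object>, <embed>, <iframe>) would need extra rules.
MEDIA_TAGS = {"video": "video", "audio": "audio"}


class FormatDetector(HTMLParser):
    """Collects every format (text, video, audio) found in one story's HTML."""

    def __init__(self):
        super().__init__()
        self.formats = set()

    def handle_starttag(self, tag, attrs):
        # Record a media format whenever a known media tag opens.
        if tag in MEDIA_TAGS:
            self.formats.add(MEDIA_TAGS[tag])

    def handle_data(self, data):
        # Any non-blank character data counts as text. This is a rough rule:
        # it would also count the contents of <script> or <style> tags.
        if data.strip():
            self.formats.add("text")


def detect_formats(html):
    """Return a sorted list of the formats present in an HTML fragment."""
    detector = FormatDetector()
    detector.feed(html)
    detector.close()
    return sorted(detector.formats)


story = '<p>Negotiators met on Monday.</p><video src="clip.mp4"></video>'
print(detect_formats(story))
```

A story that comes back with more than one format would simply get more than one icon.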

Making it work

Each format may contain one or more of these icons

I’ve been plugging away at the XSL loop for a solid week now. I think I’ve nearly exhausted all that XSLT can do for this project. I am mainly using it to pull information from a massive database when certain conditions are met. For example, I was able to make it so that when “Israel” is selected as the location, all of the stories involving Israel show up with the news item’s headline (wrapped in the story’s permalink) above the publication date, the lede and icons corresponding to the news item’s format (text, audio, video and commentary).
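The location filter described above can be sketched outside XSLT as well. The Python below is only an illustration under assumptions: the archive layout and the element names (`item`, `headline`, `link`, `pubDate`, `lede`, `location`, `format`) are invented for the example and are not the project's actual schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical archive layout; every element name here is an assumption.
SAMPLE = """
<archive>
  <item>
    <headline>Talks resume</headline>
    <link>http://example.com/talks</link>
    <pubDate>2010-03-01</pubDate>
    <lede>Negotiators met again on Monday.</lede>
    <location>Israel</location>
    <location>West Bank</location>
    <format>text</format>
    <format>video</format>
  </item>
  <item>
    <headline>Election results</headline>
    <link>http://example.com/election</link>
    <pubDate>2010-03-02</pubDate>
    <lede>Votes were counted overnight.</lede>
    <location>Nigeria</location>
    <format>audio</format>
  </item>
</archive>
"""


def stories_for(root, place):
    """Return the display fields of every item tagged with the given place."""
    results = []
    for item in root.iter("item"):
        locations = [loc.text for loc in item.findall("location")]
        if place in locations:
            results.append({
                "headline": item.findtext("headline"),
                "link": item.findtext("link"),
                "date": item.findtext("pubDate"),
                "lede": item.findtext("lede"),
                "formats": [f.text for f in item.findall("format")],
            })
    return results


root = ET.fromstring(SAMPLE)
for story in stories_for(root, "Israel"):
    print(story["headline"], story["link"], story["formats"])
```

Because an item can carry several `location` elements, the same story can turn up under more than one place.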

Although I include the publication date as part of each item’s presentation, it will have no bearing on the site’s organizing principle. Each item’s story location(s) will also show up as a dot on a world map. Hovering over either a dot or its corresponding news item’s box brings focus to both the news item and the dot (i.e. by fading out all other story items and dots).


I’ve been hard at work on coding the XML framework, specifically the processor that will generate what the final site will look like, but let me give a quick update on my progress.

Even as an XML newbie, I’ve been able to streamline the workflow I highlighted two posts ago. I’ve opted for a massive, well-constructed XML file instead of a folder structure. The XSLT stylesheet I’m coding now will draw the necessary information from the XML file and place each element in a uniquely labeled div, which I will later style with CSS.
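To make the "uniquely labeled div" idea concrete, here is a rough Python sketch of the same transformation the stylesheet performs. The element names (`item`, `headline`, `lede`), the id pattern (`item-1`, `item-1-headline`) and the function name `items_to_divs` are all assumptions made for the example.

```python
import xml.etree.ElementTree as ET


def items_to_divs(archive_xml):
    """Wrap each news item's fields in divs with unique ids for CSS styling."""
    root = ET.fromstring(archive_xml)
    container = ET.Element("div", {"id": "news"})
    for number, item in enumerate(root.iter("item"), start=1):
        # One outer div per item, numbered so every id is unique.
        box = ET.SubElement(
            container, "div", {"id": f"item-{number}", "class": "news-item"}
        )
        headline = ET.SubElement(box, "div", {"id": f"item-{number}-headline"})
        headline.text = item.findtext("headline", default="")
        lede = ET.SubElement(box, "div", {"id": f"item-{number}-lede"})
        lede.text = item.findtext("lede", default="")
    return ET.tostring(container, encoding="unicode")


archive = (
    "<archive>"
    "<item><headline>Talks resume</headline><lede>Negotiators met.</lede></item>"
    "<item><headline>Election results</headline><lede>Votes counted.</lede></item>"
    "</archive>"
)
print(items_to_divs(archive))
```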

To eliminate any duplicate news items, I’ve turned to the admirable work of an open-source community that has created extensions to help simplify overly complicated XSL commands, in this case reducing a complicated template command to a single call: set:distinct().
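For readers who don't know XSLT, the effect of the de-duplication step can be sketched in a few lines of Python. This is not how set:distinct() is implemented. It assumes that two items count as duplicates when they share a permalink, and the `link` key and the `distinct_items` name are made up for the example.

```python
def distinct_items(items):
    """Keep the first occurrence of each item, judged by its permalink."""
    seen = set()
    unique = []
    for item in items:
        key = item["link"]  # assumption: the permalink identifies a story
        if key not in seen:
            seen.add(key)
            unique.append(item)
    return unique


feed = [
    {"link": "http://example.com/talks", "headline": "Talks resume"},
    {"link": "http://example.com/election", "headline": "Election results"},
    {"link": "http://example.com/talks", "headline": "Talks resume"},
]
print(len(distinct_items(feed)))
```

Keeping the first occurrence preserves the order in which items were scraped.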

I still intend to make a database of locations and file types, but I am having second thoughts about coding a discrete topic database by hand. The more I think about it, the more I want the topic navigation to be user-defined (i.e. with a search bar) rather than solely defined by me, the developer (i.e. with a drop-down menu with a discrete set of terms).

Establishing a more usable online news archive

I apologize for getting too technical in my previous posts. I mainly went on about XML, XSLT and other coding languages to outline for myself what I intend to do for the back-end of this project. In my last post, I spoke about the limitations of my method of gathering RSS feeds, sorting them into descriptively named folders and displaying the information in a more organized and aesthetically pleasing way. The limitations are basically that I have no starting point for sorting this information. There is no data bank that I know of from which I can draw a set of key words to look for in an article’s headline and lede (such as “Nigeria” or “China”) and then use to sort items into folders with those names.

My professors assure me that this mechanism is sort of innate to the coding languages I will be using, and that I can create a robust archive with simple conditional (if this, then that) commands. My limitations lie in my programming capabilities. I know I can’t create an automated system simply because I can’t pick up a complex programming language like Python and learn it within a week or even a month. This same principle goes for making a heuristic, or self-teaching, semantic aggregator that makes associations between words such as “West Bank” and “Palestinians.” This kind of programming is for the trained professionals and enthusiasts, not for code newbies like me.
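The "if this, then that" approach can be sketched as follows. This is a minimal illustration, assuming a short, hand-maintained list of place names. `PLACES` and `tag_locations` are hypothetical names, and the list is nowhere near complete.

```python
import re

# Hypothetical, hand-maintained list of places to look for.
PLACES = ["Nigeria", "China", "West Bank"]


def tag_locations(headline, lede, places=PLACES):
    """Return every known place mentioned in a story's headline or lede."""
    text = f"{headline} {lede}"
    found = []
    for place in places:
        # Whole-word match, so "China" does not match "Chinatown".
        pattern = r"\b" + re.escape(place) + r"\b"
        if re.search(pattern, text, flags=re.IGNORECASE):
            found.append(place)
    return found


print(tag_locations("Clashes in the West Bank", "Officials in China responded."))
```

A rule like this only finds the words it is given. It would not link “West Bank” to “Palestinians” unless someone added that association by hand.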

Operational difficulties of querying semantic information

Folder Structure and Workflow for Parallactic Drift

In order to implement an efficient system of organizing news items, content providers must label information in a common way within each platform, be it RSS, blogs or web sites. Standards do in fact exist for XML tagging on news sites. Several standards bodies (including the W3C and the IPTC, which maintains NewsML) work to ensure that a single format is followed, and that information flows freely between publications and reaches more users.

Perhaps it’s because this principle of “free-flowing information” seems in itself counter-intuitive to how traditional publications share news items, but an inconsistent style stifles any RDF standard across different publications. Even if designs remain idiosyncratic, as they should, the semantic tagging of information in HTML and XML should not deviate too much from an agreed-upon standard.