Where should the MLS end and the IDX begin?

The whole ruckus over the NWMLS no longer sending its members’ listings to realtor.com inspired many pixels of commentary and many more wasted bytes of hard drive space. As I pondered a while ago, the industry appears to have a healthy appetite for technology. However, one of the comments was really insightful:

I still feel that this decision made by the board was wrong. As was the decision last year to disable the client email updates from Locator. We have the technology but are unwilling to use it. I have no love of REALTOR.com but I see no problem with sharing a limited set of data with them and offering our sellers maximum exposure of their listing. In fact, perhaps one of the reasons they discontinued the feed was because as Galen said, “Realtor.com was given the exclusive non-broker feed…” and they were getting pressure from Google and others to get a similar feed. I say give it to them. NWMLS has the ability to provide its members, all of them, with the technology usually reserved only for those with very deep pockets.

The whole thing got me wondering: is this just a tactic for the big brokers to keep the upstarts at bay? Because of the MLS system, the big brokers share their listings inventory with the smaller and independent brokers. Perhaps the big brokers want the technology kept out of the MLS because doing so handicaps the smaller upstarts without actually withholding listings from them?

Maybe there’s a less nefarious motivation. Since the NWMLS board appears to be dominated by members that belong to big brokers, perhaps they don’t want the NWMLS spending its limited computing resources (at the end of the day, even the Googles & Microsofts of the world have limited budgets; they just have a few more zeros at the end than most of us do) in areas where a big broker’s IT department or a motivated IDX vendor could do a better job. Regardless of the motivations, it does bring an interesting issue to light.

What should the MLS’s responsibilities be in terms of listing change notification, statistics/reporting, automated listing distribution, listing access via mobile devices, or any number of things that either an MLS or an IDX vendor could provide?

I’m sure the big brokers are less enthusiastic about this type of thing because some of them probably invented these kinds of technologies in-house years ago (and paid for them out of their own pockets). They probably also see the MLS as competition for viewer eyeballs and would rather the MLS make it easier to combine listings data across their empires than become a shared technology provider. MLS regionalization is probably much higher on their MLS IT wish list. After all, the point of an MLS is to share listings data, not to share listings technology.

The independent agents and the smaller brokers probably want the MLS to provide these services so they don’t have to invest any more money in their IDX vendor or IT infrastructure than they absolutely must. I also suspect a lot of people in that market segment see technology as an expense, not an investment. They only want it if they don’t have to pay for it.

As for me, I’m just an IDX vendor (I don’t have a dog in the fight). From my biased perspective, the less the MLS does, the more valuable my technology becomes, the more useful my services become, and the more opportunities for paying customers I get. I want you to spend money on your IT infrastructure and your IDX vendor! Apparently, the big brokers want you to do the same via their MLS policy direction!

Death by a thousand paper cuts

Every once in a while a Realtor or broker from out of state will ask me to develop an IDX web site for them. Unfortunately, supporting a new MLS is very similar to supporting a foreign language. It is a large software engineering task that takes a lot of time, and since I don’t already have the code written and don’t already have access to their MLS’s feed, I inform them that time is money, and the conversation usually ends there. Someday that may not be the case, but I’d rather be small & profitable than large & broke.

The problem is made worse by the fact that many Realtors don’t know what format or protocol their MLS uses for data downloads, or even whom to contact at their MLS to get a feed for an IDX vendor. If you ever want to change IDX vendors, hire a software engineer, or are crazy enough to do it yourself, you should know this. Knowing how your MLS distributes your listing data is like knowing how to change the oil in your car or how to defragment your hard drive. You don’t have to know, but it’s good to know. It may seem like I’m ranting about some MLS techie mumbo-jumbo thing again, but this stuff is preventing the industry from taking advantage of the low-cost IT innovations that could be had. I don’t think folks fully appreciate the challenges that an IDX vendor faces and how those challenges are holding back the industry’s growth and health.

For example, the NWMLS (Northwest Multiple Listing Service, which serves mainly Seattle and western Washington) uses software from Rapattoni. It provides listing data via a proprietary SOAP interface, and all the photos are accessible via an FTP server. Listing data is updated continuously (as I understand it, a new listing usually appears in our feeds about 15-20 minutes after a member enters it into the NWMLS).
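Just to give a flavor of what consuming a feed like that involves, here’s a rough sketch of polling a SOAP listing service from the member side. The WSDL URL, method name, and field names below are entirely made up (the real NWMLS/Rapattoni schema is proprietary and differs), but the shape of the code is about right:

```python
# A sketch of polling a SOAP listing service for recent changes.
# The endpoint, method, and field names are all hypothetical.
from datetime import datetime, timedelta
from zeep import Client  # pip install zeep

client = Client("https://mls.example.com/ListingService?wsdl")  # hypothetical
since = datetime.utcnow() - timedelta(minutes=30)

# Ask only for listings that changed since the last poll.
for listing in client.service.GetListingsChangedSince(since=since):
    print(listing.ListingNumber, listing.Status, listing.ListPrice)
```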

By contrast, the EBRD (East Bay Regional Data, which serves mainly Oakland, CA and the East Bay area) uses Paragon by Fidelity MLS Systems and provides its listing data via nightly updated CSV text files, downloadable by FTP. New and updated listing images are accessible as ZIPped files via FTP. Photos for active listings that haven’t recently been added or changed are not available at all (unless you bug the IT department).
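And here’s the same job EBRD-style: grab the nightly CSV over FTP and walk the rows. Again, the host, login, file name, and column names are placeholders of my own invention:

```python
# A sketch of a nightly FTP/CSV pull. Host, login, file, and column
# names are hypothetical.
import csv
import io
from ftplib import FTP

ftp = FTP("ftp.mls.example.com")  # hypothetical host
ftp.login("member123", "secret")

buf = io.BytesIO()
ftp.retrbinary("RETR listings.csv", buf.write)  # one big nightly file
ftp.quit()

# Feeds of this era tend to be Latin-1 encoded; adjust to taste.
for row in csv.DictReader(io.StringIO(buf.getvalue().decode("latin-1"))):
    print(row["ListingID"], row["Status"], row["ListPrice"])
```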

The only way they could make their systems more different is if the EBRD encoded its listings in EBCDIC! To support both, I need to develop two very different programs for downloading the listing data onto my server and importing it into my database, deal with differences in the listing schema (for example, the EBRD feed has no “Number of Photos” or “Community Name” field), and deal with differences in photo distribution (the NWMLS stores all photos uncompressed in one of a thousand subdirectories, while the EBRD just stores the fresh photos in one big ZIP file). So I can spend my limited time improving my software for new markets (which have no customers) or improving my software for my home market (which has paying customers). Unfortunately, given current market realities, I can only afford to support my home market at this time, since MLS IDX programs can be very different and there’s no place like home (so far as I know, anyway).

I keep waiting for RETS to save me from this madness, but until it happens in Seattle or the East Bay, I’m not holding my breath. After all, if two of the larger MLSes in the country, in two of the most tech-savvy areas of the nation, don’t support it yet, I interpret that as a vote of no confidence. I suppose RETS could be going great guns in the rest of the country, but if it were, I’d expect the NWMLS & EBRD to be all over it, like the establishment on Redfin.

The Center for REALTOR® Technology Web Log paints a rosy picture regarding RETS deployment in the industry. Unfortunately, according to Clareity Consulting, an IT consulting firm that serves MLSes and other parts of the real estate ecosystem, RETS is the NAR’s unfunded mandate. Everybody wants the benefits of RETS, but nobody is willing to pay for it. Furthermore, it appears that back in the days before I got sucked into real estate technology, there was an effort to promote the DxM standard, and that went nowhere (which is a bad omen). What’s worse is that they keep moving the goal posts. We don’t even have widespread RETS 1.0 support, and they’ve already deprecated that standard and gone full bore on RETS Lite and RETS 2.0. The biggest problem seems to be one of vision and scope: they keep adding more features to cover more scenarios when we don’t even have wide deployment of the existing standard (assuming we had standards to begin with at all). It reminds me of the recent software industry debacle known as the “Longhorn reset”. The problem is that RETS is just too complicated, in an environment with too many legacy systems in place, too few resources to support it, and excessive aspirations. The idea of RETS is great; it’s the implementation and deployment that are disappointing, and at least Microsoft pulled Vista out of its death spiral…

The sad thing is that the computer industry already has great tools for moving data around over the Internet in efficient and well-supported (if sometimes proprietary) ways. They allow you to query, slice, and dice your data in a near-infinite number of ways. They’re called database servers. They are made by multiple software vendors, and there are even some excellent open source ones out there. They let you set permissions on which accounts can see which tables or views (gee, sounds like something an MLS would want). The better ones even take that security down to the field level. Even better, most of these so-called database servers can export data into spreadsheets, reporting tools, and even GIS systems. All of them provide a well-defined and oftentimes well-implemented API that software developers can use and exploit to implement what hasn’t been invented yet!
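To make that concrete, here’s a minimal sketch of how an MLS could carve out a member-visible slice of its listing database with an ordinary view and a grant. Every object name, DSN, and credential below is a hypothetical placeholder; a real deployment would be fussier about roles and schemas:

```python
# A hypothetical MLS-side provisioning script: expose only the approved
# columns through a view, then grant a member account read-only access.
import pyodbc  # pip install pyodbc

admin = pyodbc.connect("DSN=example_mls_admin;UID=dba;PWD=secret")

# The view *is* the field-level security: columns not listed here simply
# don't exist as far as the member account is concerned.
admin.execute("""
    CREATE VIEW ActiveListingsView AS
    SELECT ListingID, Status, City, ListPrice, Bedrooms, LastModified
    FROM Listings
    WHERE Status = 'Active'
""")
admin.execute("GRANT SELECT ON ActiveListingsView TO member123")
admin.commit()
```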

Why don’t the NAR & the MLSes save us all the trouble, standardize on a few good database platforms (I’m a fan of MS SQL Server and MySQL, but I’d settle for anything with ODBC, .NET & Java support at this point), and provide everybody with RDBMS accounts? It’d lower costs for us IDX vendors (less code to write, since everything is just SQL); it’d lower costs for MLS vendors (since data access, security, programmability, and scalability become the RDBMS vendor’s problem); it’d provide more choices for agents and brokers (since getting Excel talking to MS SQL Server is a cakewalk compared to RETS); and it’d lower IT costs for the MLS (because the MLS vendors don’t need to invent an industry-specific solution to a problem that’s been largely solved already, and I’m betting the MLS vendors already use somebody else’s RDBMS to implement their solutions anyway). Granted, a SQL server won’t enable all the scenarios that RETS wants to enable (if RETS were ever well implemented and widely deployed enough for that to happen). However, I’m of the belief that it’s not going to happen until after Trulia or Google Base becomes the de facto nationwide MLS by providing a single schema with a simple REST-like web services interface.
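For what it’s worth, here’s roughly what member-side access could look like under that scheme, assuming the hypothetical read-only account and view from the sketch above. This is why I call ODBC a cakewalk: the entire “protocol” is one connection string and a SQL statement:

```python
# A minimal sketch of member-side RDBMS access; the DSN and view name
# are hypothetical.
import pyodbc  # pip install pyodbc

conn = pyodbc.connect("DSN=example_mls;UID=member123;PWD=secret")
# Show all active Redmond listings under $800K -- just SQL, no RETS.
for row in conn.execute(
        "SELECT ListingID, City, ListPrice FROM ActiveListingsView "
        "WHERE City = ? AND ListPrice < ?", "Redmond", 800000):
    print(row.ListingID, row.City, row.ListPrice)
conn.close()
```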

So, what does your MLS do to support IDX vendors? Do they provide all the data all the time, or just daily updates? Have they deployed RETS yet? Are they going to? Who is their MLS software vendor, or do they have a home-grown solution? What do you want to do that you can’t do today because the data is in a format you can’t use easily? Would you be willing to pay more in membership dues for better software or better service from your MLS? Are we at the dawning of the RETS revolution, or is it too little, too late?

PS – Anybody know anybody from an MLS / IDX dept or MLS vendor who blogs? I’d love to know what things are really like on their side of the listing data fence.

"I am Tiger Woods"

When I was at Inman, I believe it was Dottie Herman (although I realize that Altos Research attributes the quote to Burke Smith) who said, “Technology won’t replace agents; agents with technology will replace agents.” Regardless of the source, it’s a great quote! That remark struck a chord with me. Except there’s one small problem… there aren’t enough “real” technology vendors out there! Let me explain further…

OK, at one end of the Real Estate 2.0 spectrum, you have Zillow, Move, & Trulia. They use cool technology to sell advertising in the Real Estate market. Nothing wrong with that. Being a ’softie alum, perhaps I’m a bit too set in my ways to fully appreciate the size of the opportunity these fine companies are going after. After all, MS only has a 10% share of the $500 billion/year enterprise IT market. But Google probably only has a 1% share of the $3 TRILLION/year advertising market. Maybe those numbers are off, but it feels like a good Zestimate to me. Clearly there’s a lot of money to be made from the death of print media, and these guys are at the graveyard with their shovels ready. More power to them, I say.

At the other end of the Real Estate 2.0 spectrum, you have HouseValues, HomeGain & others. They try to use technology to lure in and sell leads. It’s not my cup of tea, and some people don’t like them, but there’s nothing wrong with that business model either.

But where are the companies that use technology to just sell technology? When I look at the MLS search offerings of my future competitors like Birdview, Wolfnet, Superlative, Logical Dog, and literally a cast of thousands, I just cry and smile. The maps are non-existent or very Web 1.0-ish. RSS or KML? What’s that? Foreign language support? Is English considered a foreign language yet? Data visualization? You gotta be joking. Page speed? Maybe if you measure performance with a calendar. And I haven’t even talked about half of the things I want to see or invent in a world-class MLS search tool.

Granted, my game still needs a lot more work as well. Zearch is still English-only, there’s more to data visualization than pretty Zillow charts, I really have no idea how badly I scale yet (better than reply.com, I hope; otherwise I know I’ll never hear the end of it), and I only support the NWMLS right now, but on the whole I’m feeling pretty optimistic about my chances on the pro tour.

Picture this scenario: here I am, John Q. Homebuyer, getting my Zillow fix, Moving around the web, and being Trulia impressed with all this Real Estate 2.0 stuff, and then I click on your ad. I go to your web site, I wanna search for homes (because frankly, that’s why people visit your site, unless you’re a famous blogger), and do you know what happens next? It reminds me of the guy going for a test drive in the new Volkswagen radio ads.

“This broker’s web site has 3 speeds, and this one is disappointment. Web site, honk if you suck *honk*. Take me to a RealTech or Caffeinated web site”

FYI – I’m leaving out Redfin because they are an exception to this generalization. They are a broker that has developed great technology in-house, and they are keeping it all to themselves (punks ;)). So most other brokers can’t really compete with them, technologically speaking, unless they partner with a technology vendor (like RealTech or me).

I mean, we have all these “consumer portal” companies doing interesting work, empowering consumers, and then when I visit the broker’s or agent’s web site for the full story, it’s a total and complete letdown. There are over 1 million agents in this country, probably only a thousand of them have web sites worth visiting (I suspect half of those are regular Rain City Guide readers), and hardly anyone has compelling MLS search tools. It feels like all the good software engineers involved with this industry want to sell it an ad or a lead instead of a killer web site. Maybe the industry needed a few well-funded and very talented start-ups to smack it around before it finally wakes up and smells the software? (sniff, aaaahh, Firefox fresh scent, yummy)

Clearly there’s a big opportunity for developing a good MLS search tool for this industry. Maybe not Zillow, Microsoft or Google sized, but it’s big enough to make me interested in going for it. I’m pretty excited at the thought of all the possibilities, personally.

This is why I cry and smile. I cry because I feel my clients’ pain. They just want a cool web site to capture leads so they can get off the advertising & lead-buying treadmill, and finding a good one is just about impossible. I cry because I feel the home buyers’ pain. This stuff should be so much better than it is. Many brokers have the money and are willing to do something about it, but the current set of vendors serving them seems to be developing products like it’s the web circa 1999. I smile because I’m in a position to do something about this. I feel like a Tiger. Here’s how I break things down on the links…

Jack Nicklaus is still winning most of the major tournaments these days, but he’s the one whose records I wish to break. RealTech has done some real nice work w/ John L Scott and CB Bain (did you guys do CatalistHomes? It looks like your work, but I don’t see your brand anywhere). He has a few years of a head start on me and is probably in the process of making his other clients very happy. I hope that Zearch will eventually be as well regarded as the work you’ve done.

But after Jack, I can’t see anybody else out on the course improving their game. Maybe they are all down at the clubhouse sipping some Buds? Maybe they think the minefield of MLS downloading rules and methods will keep their market shares safe from technology disruptors (and to be honest, they are partly right – I wouldn’t be crazy enough to take this on if I weren’t so convinced that I could build a much better set of web tools than most of the vendors I’ve discovered). Maybe they’ve never read Andy Grove’s “Only the Paranoid Survive”? But if this industry embraces RETS (or better yet, screws the SOAP and lets me get dirty w/ the MLS’s SQL Servers), I suspect a few names on the MLS/IDX web site industry leaderboard will change.

But how can any vendor support all 900+ MLSes in this country?! This is a monster challenge, even for a Tiger. We’re talking a 600-yard, par-3-sized challenge here, folks. Sorry, but even Tiger’s Nike golf equipment can’t par that hole. I suspect I’ll just refine my game on the local links until I get really good. (If Dustin would only give me the connection string to Realtor.com’s SQL cluster, it would all be so much easier. ;)) Oh well, if I gotta play the game one hole at a time, that’s the way I gotta play. Just keep making pars, make a birdie here or there, no bogeys, and watch the other players fall apart like it’s a Sunday afternoon at a major. I dunno, but it’s starting to feel like the 2nd round of the 2000 US Open at Pebble Beach to me.

I’m working out, I’m going to a swing coach, I’m sinking my putts, I’m killing balls on the driving range, and more importantly, I’m feeling a little faster, stronger, & smarter with each passing week. So all you other players better step up your game. Tiger’s turning pro soon. Maybe not this year, maybe not next, but soon. And when he does, the game of real estate will not be the same.

Except for Jack – I wouldn’t worry about him too much. We’ve all seen the green jackets in his closet. 🙂

I know Ubertor’s got game, but I consider them more of a Michael Jordan-type player. Great stuff, but he plays a different sport than we do. So Mr. Agent & Ms. Broker, are there any good MLS/IDX vendors out there whose game impresses you? (Other than Jack’s & Tiger’s, of course?)

There you go again – the MLS doesn’t scale

Ever since Zearch, I’ve been bombarded with work to update or create MLS search web sites for various brokers & agents across the country. Because of this, I’ve had the opportunity to deal with two more MLSes: one in the Bay Area (EBRDI) and one in Central Virginia (CAARMLS). Before I begin another MLS rant (and cause the ghost of the Gipper to quip one of his more famous lines), I want to say the IT staff at both the EBRDI & the NWMLS have been helpful whenever I’ve had issues, and the primary purpose of this post is to shine a light on the IT challenges that an MLS has (and the hoops that application engineers have to jump through to address them).

After working with the EBRDI and the NWMLS, I can safely say the industry faces some interesting technical challenges ahead. Both MLSes have major bandwidth issues, and data downloads from their servers can be so slow that it makes me wonder if they’re using Atari 830 acoustic modems instead of network cards.

The EBRDI provides data to members via FTP downloads. They provide a ZIP file of text files for all the listing data (which appears to be updated twice daily) and a separate file containing all the images for that day’s listings (updated nightly). You can request a DVD-R of all the images to get started, but there is no online mechanism to get older images. This system is frustrating because if you miss a day’s worth of image downloads, there’s no way to recover other than bothering the EBRDI’s IT staff. If the ZIP file gets corrupted or the download is otherwise interrupted, you get to download the multi-megabyte monstrosity all over again (killing any benefit that zipping the data might have had). Furthermore, ZIP compression of images offers no real benefit; since JPEGs are already compressed, the 2-3% size savings is offset by the inconvenience of dealing with large files. The nightly data file averages about 5 MB (big but manageable), but the nightly image file averages about 130 MB (a bit big for my liking, considering the bandwidth constraints the EBRDI is operating under).
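One small mitigation, incidentally: make the download resumable, so a dropped connection doesn’t cost you the whole 130 MB again. A sketch, with a hypothetical host, login, and archive name:

```python
# Resume a partial FTP download using the REST command, so an interrupted
# transfer picks up where it left off instead of starting over.
import os
from ftplib import FTP

REMOTE_FILE = LOCAL_FILE = "images_20060615.zip"  # hypothetical archive name

ftp = FTP("ftp.ebrdi.example.com")  # hypothetical host
ftp.login("member123", "secret")

# Start from however many bytes we already have on disk.
offset = os.path.getsize(LOCAL_FILE) if os.path.exists(LOCAL_FILE) else 0
with open(LOCAL_FILE, "ab") as f:
    ftp.retrbinary(f"RETR {REMOTE_FILE}", f.write, rest=offset)
ftp.quit()
```

(That only works if the server supports the REST command, which many do.)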

As much as I complain about the NWMLS, I have to admit they probably have the toughest information distribution challenge. The NWMLS is probably the busiest MLS in the country (and probably one of the largest as well). According to Alexa.com, their servers get more traffic than Redfin or John L Scott. If that weren’t load enough, the NWMLS is the only MLS I’m aware of that offers sold listing data [link removed]. If that weren’t load enough, they offer access to live MLS data (via a SOAP-based web service) instead of the daily downloads that the EBRDI & CAARMLS offer their members. If that weren’t load enough, I believe they allow up to 16 or 20 photos per active listing (which seems to be more than the typical MLS supports). So you have a database with over 30,000 active listings & 300,000 sold listings, all being consumed by over 1,000 offices and 15,000 agents (and their vendors and consultants). The NWMLS also uses F5 Networks’ BIG-IP products, so they are obviously attempting to address the challenges of their overloaded information infrastructure. Unfortunately, by all appearances it doesn’t seem to be enough to handle the load that brokers & their application engineers are creating.

Interestingly, the other MLS I’ve had the opportunity to deal with (the CAARMLS in Central Virginia) doesn’t appear to have a bandwidth problem. It distributes its data in a manner similar to the EBRDI. However, it’s a small MLS (only 2,400-ish residential listings), and I suspect it avoids bandwidth problems because it has fewer members to support and less data to distribute than the larger MLSes do. Either that, or the larger MLSes have seriously under-invested in technology infrastructure.

So what can be done to help out the large MLSes with their bandwidth woes? Here are some wild ideas…

Provide data via DB servers. The problem is that as an application developer, you only really want the differences between your copy of the data and the MLS data. Unfortunately, providing a copy of the entire database every day is not the most efficient way of doing this. I think the NWMLS has the right idea with what is essentially a SOAP front end for their listing database. Unfortunately, writing code to talk SOAP, do a data compare, and download the results is a much bigger pain than writing a SQL stored proc to do the same thing or using a product like Red Gate’s SQL Compare. Furthermore, SOAP is a lot more verbose than the proprietary protocols database servers use to talk to each other. Setting up security might be tricky, but modern DB servers allow view, table, and column permissions, so I suspect that’s not a major problem. Perhaps a bigger problem is that every app developer probably uses a different back-end, and getting heterogeneous SQL servers talking to each other is probably as big a headache as SOAP is. Maybe using REST instead of SOAP would accomplish the same result?
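Here’s the kind of thing I mean, sketched out: with direct (read-only) SQL access, “give me just the differences” collapses into a single timestamp-filtered query. The DSN, view, and column names are hypothetical:

```python
# Pull only the rows that changed since the last sync, instead of
# re-downloading the whole database. All names are hypothetical.
import pyodbc  # pip install pyodbc
from datetime import datetime

conn = pyodbc.connect("DSN=example_mls;UID=member123;PWD=secret")
last_sync = datetime(2006, 6, 14, 3, 0)  # in practice, persisted locally

changed = conn.execute(
    "SELECT ListingID, Status, ListPrice, LastModified "
    "FROM ActiveListingsView WHERE LastModified > ? "
    "ORDER BY LastModified", last_sync).fetchall()

print(f"{len(changed)} listings changed since {last_sync}")
# ...upsert each changed row into the local copy here...
conn.close()
```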

Provide images as individually downloadable files (preferably over HTTP). I think HTTP would scale better than FTP for many reasons. HTTP is a less chatty protocol than FTP, so there’s a lot less back & forth between the client & server. Also, there’s a lot more tech industry investment in the ongoing Apache & IIS web server war than in improving FTP servers (I don’t see that changing anytime soon).

Another advantage is that most modern web development frameworks have a means of easily making HTTP requests and generating dynamic images at run time. These features mean a web application could create a custom image page that downloads the image file on the fly from the MLS server and caches it on the file system when it’s first requested. Then all subsequent image requests would be fast, since they are served locally, and more importantly, the app would only download images for properties that were searched for. Since nearly all searches are restricted somehow (show all homes in Redmond under $800K, show all homes with at least 3 bedrooms, etc.) and paged (show only 10, 20, etc. listings at a time), an app developer’s or broker’s servers wouldn’t download images from the MLS that nobody was looking at.
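A sketch of that fetch-on-first-request idea (the MLS photo URL scheme below is invented for illustration):

```python
# Fetch a listing photo from the MLS over HTTP the first time anyone asks
# for it, then serve the locally cached copy on every later request.
import os
import urllib.request

MLS_PHOTO_BASE = "http://photos.mls.example.com"  # hypothetical URL scheme
CACHE_DIR = "photo_cache"

def get_listing_photo(listing_id: str, photo_num: int) -> bytes:
    cached = os.path.join(CACHE_DIR, f"{listing_id}_{photo_num}.jpg")
    if not os.path.exists(cached):
        os.makedirs(CACHE_DIR, exist_ok=True)
        url = f"{MLS_PHOTO_BASE}/{listing_id}/{photo_num}.jpg"
        with urllib.request.urlopen(url) as resp, open(cached, "wb") as out:
            out.write(resp.read())  # only searched-for photos get pulled
    with open(cached, "rb") as f:
        return f.read()
```

The net effect is that the MLS only ever serves each photo once per broker, and only for listings somebody actually looked at.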

Data push instead of pull. Maybe instead of all the brokers constantly bombarding the MLS servers, the MLS could push data to broker servers at predefined intervals and in random order. This would prevent certain brokers from being bandwidth hogs, and perhaps it might encourage brokers to share MLS data with each other (easing the MLS bandwidth crunch) and leading to my next idea.

BitTorrents? To quote a popular BitTorrent FAQ – “BitTorrent is a protocol designed for transferring files. It is peer-to-peer in nature, as users connect to each other directly to send and receive portions of the file. However, there is a central server (called a tracker) which coordinates the action of all such peers. The tracker only manages connections, it does not have any knowledge of the contents of the files being distributed, and therefore a large number of users can be supported with relatively limited tracker bandwidth. The key philosophy of BitTorrent is that users should upload (transmit outbound) at the same time they are downloading (receiving inbound.) In this manner, network bandwidth is utilized as efficiently as possible. BitTorrent is designed to work better as the number of people interested in a certain file increases, in contrast to other file transfer protocols.”

Obviously, MLS download usage patterns fit this profile. The trick would be getting brokers to agree to it and doing it in a way that’s secure enough to keep unauthorized people from getting at the data. At any rate, the current way of distributing data doesn’t scale. As the public’s and the industry’s appetite for web access to MLS data grows, and as MLSes across the country merge and consolidate, this problem is only going to get worse. If you ran a large MLS, what would you try (other than writing big checks for more hardware)?