I thought I’d take a moment to reflect on how Rain City’s favorite MLS Search is implemented. I’m a little tired of thinking in computer languages (mostly T-SQL, C# and Javascript), so I figured I’d blog a bit in Geek English for a little while before I hit the compiler again.
[photopress:matrix1_alt.jpg,full,alignright]
I’m always interested in how web sites and computer software work under the covers, so I thought I’d share some of the more interesting points about how I’ve implemented “Zearch” to date for the “geekier” folks in the blogosphere.
It all began way back in the fall of 2005 shortly after I got my first MLS feed. At the time, Microsoft’s asp.net 2.0 platform was still in beta. However, after learning what Microsoft’s next generation web development tools were going to do (and seeing what Google Maps and Microsoft’s Virtual Earth teams were doing), I saw a great unrealized potential in MLS search tools and decided to do something about it.
Anyway, it’s all built on top of asp.net 2.0 and MS SQL Server 2000 (yeah, I know I’m old school). One of the first things I did was combine all the property types into a VIEW and create a dynamic SQL query when you search for properties. Some search tools only let you search for residential properties or condominiums at one time (which I thought was lame). I originally tried to implement things with a bunch of UNIONs, but keeping track of the schema variations for the different property types eventually drove me nuts, so I encapsulated all that crud into a VIEW.
I also find it a little ironic that I’m not the only one who found the MLS schema differences a PITA to deal with. I’m glad the various MLS software vendors and the CRT are working toward a common industry schema (aka RETS), so we application developers can focus on the real problem (developing compelling & useful software) instead of remembering that the ld column in one table is really the list_date column in another.
Another interesting thing I do on the back end is geocode every listing after each data download. The main reason is that I don’t trust the MLS data, and their bogus geocoding would make my app look bad. I also knew when I started that I’d eventually do maps, so as soon as a new listing hits my database, it gets more accurately/correctly geocoded. In case you’re wondering if I’m screen scraping w/ Perl or something else, it’s all done with T-SQL stored procedures. (Well, technically it’s a proc that calls the MSXML2.ServerXMLHTTP COM object to issue an HTTP request against a geocoding web service, and then uses OPENXML on the response’s XML to get the latitude & longitude.)
As you might have guessed, there are also stored procedures and functions to get the distance between two points, do radius searches, and other stuff of that ilk. Fortunately, all that stuff can easily be found using your favorite search engine, so you don’t need to know how all the math in the law of cosines works (you just need to know of it).
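The post doesn’t show those procedures, but the math it alludes to (the spherical law of cosines) is short enough to sketch. Here’s an illustrative JavaScript version, not Zearch’s actual code; the plain `{ lat, lon }` listing shape is my own assumption:

```javascript
// Great-circle radius search via the spherical law of cosines -- a sketch,
// not the actual stored procedures the post refers to.
const EARTH_RADIUS_MILES = 3959; // mean Earth radius

function toRadians(deg) {
  return (deg * Math.PI) / 180;
}

// Distance in miles between two (lat, lon) points given in degrees.
function greatCircleMiles(lat1, lon1, lat2, lon2) {
  const p1 = toRadians(lat1);
  const p2 = toRadians(lat2);
  const dLon = toRadians(lon2 - lon1);
  // Spherical law of cosines; clamp to [-1, 1] to guard against
  // floating-point drift before calling acos.
  const cosC =
    Math.sin(p1) * Math.sin(p2) + Math.cos(p1) * Math.cos(p2) * Math.cos(dLon);
  return EARTH_RADIUS_MILES * Math.acos(Math.min(1, Math.max(-1, cosC)));
}

// A radius search is then just a filter over listings.
function withinRadius(listings, centerLat, centerLon, radiusMiles) {
  return listings.filter(
    (l) => greatCircleMiles(centerLat, centerLon, l.lat, l.lon) <= radiusMiles
  );
}
```

The clamp matters in practice: for two nearly identical points, rounding can push the cosine a hair above 1, and acos of that is NaN.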
Well, that’s it for the back end. Next time I’ll put on my Web Developer hat and talk about the front end.
Did you know:
- Google’s stock is a bubble. Seattle area real estate may or may not be.
- Zillow uses an open source front end but is using MS SQL 2005 on the back end.
- HouseValues is a Microsoft shop and RedFin isn’t.
- The server that runs our home search is a home built machine running Windows Server 2003 co-located at IsoMedia in Redmond.
- The server that runs Rain City Guide blog is at Inmotion hosting in California (running WordPress on a LAMP stack).
- MySpace is now using asp.net and handles about 1.5 billion page views a day. (2nd only to Yahoo)
- Via Virtual Earth (a blog for Virtual Earth developers) has made Rain City Search a Site Gallery pick. You know you’re doing cool stuff when other geeks like it. 🙂
The Site Gallery Pick is awesome! Congrats Robbie!
Something I’d like to see in a real estate search is a commute cost calculator. In other words, before you start searching you would give the address of your workplace, the mileage of your vehicle and the cost of your time. Then every time you viewed a house it would calculate the distance between that house and your workplace and give you the cost of the commute in miles (gas) and time. It could then add this to the house cost to give you the true cost of your home with commuting. This might give the Seattleites buying up property here in Tacoma more pause…
Of course to do this you would need a web service to give you driving miles between two addresses, which none of the mainstream free map services do AFAIK. So it might require licensing a real (non-free) map service which is expen$$ive.
Eric,
With my background as a transportation planner, this idea has some special interest for me. Our regional government already has detailed commute data for today (and future years!):
But good luck getting them to give it up for public consumption!
Eric,
You may be pleased to learn that displaying the distance between property A and point B, as the bird flies, is on my short list of features to add this year. I’m still undecided if I want user entered points of interest (and other data) to be private (for your eyes only), protected (you & the agent), public/anonymous (you and anybody else who’s interested) or let the user decide. I can see arguments for all sides, so any suggestions or thoughts on the visibility and privacy of user data would be welcome.
I also need to do some benchmarking/stress testing, so I can find out what’s the fastest way to do radius searches since my server is going to be doing a lot more of them as our search tool gets more popular and more interactive.
I suspect the best approach for fast radius searches is to cache all the trig calculations you can in the database and have a native code or CLR-based stored procedure that does all the math you need to do (I don’t believe trigonometry calculations are T-SQL’s strong suit). I’m sure getting a 64-bit server with a ton of RAM also helps. If any software / database engineers from Zillow (or any place else) want to share their secrets for faster radius searches, I’m happy to listen. 🙂
As you mentioned, displaying driving distance & directions is something that requires a real (non-free) map service. And as Dustin mentioned, getting predicted driving times requires being friends with transportation engineers. I’m leaning toward getting a MapPoint Web Service subscription at some point, but if anybody knows of anything similar or cheaper, let me know. I currently come dangerously close to my Yahoo web service limits every day…
Hmmm… fast radius calculations. Maybe if you projected each house around a center point (in batch) you could speed things up. Then each house’s coordinate would no longer be latitude and longitude, but could be meters from a center point (the Space Needle?). At that point you could just use the distance formula everyone knows to calculate distances (sqrt((x1-x2)^2+(y1-y2)^2)) - no more of that complex trig you need when doing latitude and longitude distance calculations.
You could optimize speed further by placing houses in “zones” on the map. When doing radius searches you could narrow down how many zones to search by calculating which zones the radius could reach into. So in other words if I’m doing a 10-mile search in Zone A I don’t need to search for houses in Zone Z that’s 50 miles away!
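Eric’s batch-projection idea can be sketched like this (JavaScript for illustration rather than SQL; the approximate Space Needle coordinates and the per-degree constants are my own assumptions, not from the comment):

```javascript
// Sketch of projecting listings once, in batch, to flat x/y meters from a
// fixed origin, so query-time distance is just the Pythagorean formula.
const ORIGIN = { lat: 47.6205, lon: -122.3493 }; // Space Needle (approx.)
const METERS_PER_DEG_LAT = 111320; // roughly constant everywhere

function toRad(deg) {
  return (deg * Math.PI) / 180;
}

// Meters per degree of longitude shrinks with cos(latitude); baking the
// correction in here is what keeps the projected circles from being ovals,
// at least near the origin.
const METERS_PER_DEG_LON = METERS_PER_DEG_LAT * Math.cos(toRad(ORIGIN.lat));

// Done once per listing at load time, not per search.
function project(lat, lon) {
  return {
    x: (lon - ORIGIN.lon) * METERS_PER_DEG_LON,
    y: (lat - ORIGIN.lat) * METERS_PER_DEG_LAT,
  };
}

// At query time: no trig at all, just the distance formula everyone knows.
function flatDistanceMeters(a, b) {
  return Math.sqrt((a.x - b.x) ** 2 + (a.y - b.y) ** 2);
}
```

This flat approximation degrades as listings get far from the origin, but for a metro-sized search area it’s usually within a fraction of a percent.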
Eric – clever, but you’re reinventing the wheel! You can also use a better coordinate system for spatial search. UTM NAD83 typically has units in meters and covers all of Western Washington State. A squared plus B squared equals C squared.
Using the Pythagorean theorem to calculate the distance has merit. The search radii are usually pretty small, so the curvature of the earth should have little bearing on the distance in the cases I care about (at least in the continental US). It’s much faster than trig, and probably accurate enough for what I’m doing.
My biggest concern with that approach is that I don’t want search results to appear as ovals (or have to figure out the math to correct it so it looks circular), since the meters per degree of longitude varies as you move up and down the earth. Then again, since Virtual Earth uses a Mercator projection, I already have this problem anyway. Hmmm….
BTW, I already eliminate all the houses more than a couple tenths of a degree of latitude & longitude from the center point of my search (which led to a pretty big speed boost) before I even do the trig.
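That cheap pre-filter is worth sketching, since it’s where the big speed boost comes from. A rough JavaScript version (the mileage constant and listing shape are illustrative assumptions):

```javascript
// Sketch of a bounding-box pre-filter: throw out anything outside a small
// lat/lon window around the center before doing any trig at all.
const MILES_PER_DEG_LAT = 69; // roughly one degree of latitude

function boundingBoxPrefilter(listings, centerLat, centerLon, radiusMiles) {
  const latWindow = radiusMiles / MILES_PER_DEG_LAT;
  // Degrees of longitude are "shorter" away from the equator, so widen the
  // window by 1/cos(lat). Over-including is fine here -- the exact distance
  // check still runs afterwards on the survivors.
  const lonWindow = latWindow / Math.cos((centerLat * Math.PI) / 180);
  return listings.filter(
    (l) =>
      Math.abs(l.lat - centerLat) <= latWindow &&
      Math.abs(l.lon - centerLon) <= lonWindow
  );
}
```

Because the filter only has to be conservative (never drop a real match), it can use simple comparisons that an index on lat and lon can serve directly.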
Yes, the results will appear to be ovals. The question then becomes: are you interested in accuracy or perceived accuracy? I think if you don’t show the circle, no one will notice.
You have hit on a core issue of RE search tool design Galen…
Accuracy or perceived accuracy.
Keep up the good work Robbie, believe me we are watching!
P.S. Dustin, an interesting thing I have noticed: by displaying my email address instead of my name on comments, I am receiving several junk mail items that drop RCG in the body of the emails.
I guess they are trolling for email addresses on these more popular blogs??? I wonder if there is a plugin to stop this. I will try to look for one.
The key is to do as many intermediate calculations up front and store them for future use. Take a look at your calculations using trig functions and see which intermediate calcs are not dependent on the Point B location. Store those for future use.
We’ve been doing radius searches for about 15 years now, so we’ve refined our code a bit over that time. Well actually, it was all refined pretty much up front because computers were so slow, we were forced to watch the cycles.
Of course, the best way is to use the intelligent routing algorithms found in mapping apps, which look at the actual route. Have fun building enough capacity to handle several thousand simultaneous users doing that!
“Best”
Bert
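A sketch of what Bert describes, in JavaScript for illustration: in the law-of-cosines distance, the sine and cosine of each listing’s latitude don’t depend on where the search is centered, so they can be computed once when the listing is stored and reused on every search. The function names and listing shape here are hypothetical:

```javascript
// Bert's suggestion, sketched: in
//   d = R * acos(sin(lat1)sin(lat2) + cos(lat1)cos(lat2)cos(lon2 - lon1))
// the sin/cos of each *listing's* latitude never change between searches,
// so precompute them at insert time.
const R_MILES = 3959;
const degToRad = (d) => (d * Math.PI) / 180;

// Done once, when the listing enters the database.
function precompute(listing) {
  const latRad = degToRad(listing.lat);
  return { ...listing, sinLat: Math.sin(latRad), cosLat: Math.cos(latRad) };
}

// Per search: two trig calls for the center, then only one cos() per
// listing instead of four trig calls per listing.
function radiusSearch(precomputed, centerLat, centerLon, radiusMiles) {
  const c = degToRad(centerLat);
  const sinC = Math.sin(c);
  const cosC = Math.cos(c);
  return precomputed.filter((l) => {
    const cosD =
      sinC * l.sinLat +
      cosC * l.cosLat * Math.cos(degToRad(l.lon - centerLon));
    return R_MILES * Math.acos(Math.min(1, Math.max(-1, cosD))) <= radiusMiles;
  });
}
```

The trade is extra storage per row for fewer trig calls per search, which is exactly the look-up-versus-calculate question Robbie weighs below.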
I wonder if that’s still the case today, though. With MS SQL 2005, you can have procs written in .net or native code. (I believe Crazy Larry’s database has similar capabilities, but I’m talking about MS SQL, because that’s what I know best.) Also, x86 CPUs have had floating point processors built in for the past 10+ years, so floating point math on a PC can be pretty quick these days, if you let the hardware do it.
Most of the intermediate calculation methods I’ve seen take a lat & long and cache x, y, & z coordinates. So you’ve now increased the amount of storage you need for location information, which will slow things down a bit. However, historically speaking, SQL look-up is “fast” and calculation is “slow”, so this approach makes sense.
But with .net or native code procs, SQL calculation is now “fast”, so it could be quicker to calculate it on the fly than it is to store & retrieve it. Then again, it might not be.
I’m still using SQL 2000, so the discussion is kind of moot at the moment anyway (I should be caching the intermediate calculations). But when I move to SQL 2005, I’ll definitely be comparing both approaches to see if the old rules are still the best ones.
BTW, after doing some more research, I’ve learned that simple calculations are best left in SQL, since invoking a native code or CLR proc has a fixed overhead. So, if you want to convert Fahrenheit to Celsius, do it in SQL since it’s much faster; if you want to do radius searches, it’s a virtual tie; and if you want to do something more CPU intensive, use a CLR or native code based proc. I just didn’t know where that line of diminishing returns was.
Source: Processor.com
Thanks for the discussion and insights!
Robbie,
Your search got a really nice mention on a SQL Server podcast!
I was recently granted access to a NWMLS feed and now have to create a search feature for a website.
Are there any resources out there that you might have that could help me in this process? I am using ASP.NET 2.0.
If I could get a hold of something that does a simple search, that would be great and would save me a TON of time – otherwise I’m stuck here programming it on my own, which could get very interesting.
Any help is appreciated.
-Ron
Aside from being granted access to a feed, I’ve always felt the hardest part (perhaps frustrating is a better word) is getting the data into a useful form to begin with. If you’re crazy enough to develop an NWMLS search tool from scratch, I’d recommend the following…
Join the NWMLS dev forum at http://groups.yahoo.com/group/NWMLS-EverNet.
I’d recommend you take advantage of the new SqlDataSource & GridView controls in ASP.net 2.0.
I’d recommend you learn about dynamic image pages in ASP.net.
Since I’m attempting to sell the search tool I created on Rain City, I don’t want to be too helpful, but hopefully the above suggestions will give you a better idea what you’re in for.
Ron,
Are you trying to use the NWMLS feed as your DB, or are you asking for help on how to download and store the data so you can search using your own DB?
Robbie,
Thank you for the YahooGroup – I’m sure I’ll find a lot of useful information there. I understand that you cannot help me much further than you already have but it was worth a shot.
Allen:
I already have access to the feed and can make queries to it without a problem. I will not be storing any of the data in my own db, only making query requests to provided server.
As Robbie said above, I feel the same way about putting the information into a useful form. I am new to ASP.NET but have a background in PHP and Visual Basic.
I’m using SqlDataSource & GridView – I guess I’ll be squeezing out everything my brain has!
Thanks guys,
Ron
Ron,
Since you have a background in VB & PHP, I suspect the .net class library is your biggest hurdle. Doing things in VB.net instead of C# may make things a little easier for you given your background. Also, there’s no law saying you can’t implement your site in PHP if you’re more comfortable with that toolset.
If you’re just getting started with ASP.net, I’d recommend playing with the ASP.net 1.x tutorials on Dot Net Junkies and the ASP.net 2.0 tutorials on asp.net.
Robbie
Pingback: Seattle’s Rain City Real Estate Guide » Improving Online Home Valuations?
I want to show a map of a state in a PictureBox. On that map, at particular locations, I want to continuously show percent rainfall from a database when I load my form. I want to do this application in VB.net. Could you please help me?