Proposal Critique
You’re working for one of the world’s largest real estate brokerages (Douglas Elephant).
They’re building a system to identify which homes that are not on the market are the best prospective business opportunities for their agents. Real estate agents contact homeowners and offer their services in case the homeowners are interested in selling their homes soon. The services include consulting on how best to prepare and position the home to sell quickly and for a high price, listing the home for sale and marketing it, and helping the client with offers and with the final transaction. In return for these services, the agent receives a commission of 2–3% of the sale price. Douglas Elephant’s CTO has specified that their Top-Notch Seller (TNS) recommendation system will steer agents to the homeowners who will make them the most money. The CTO has noted that a key to success is that the model also provide some explanation for why a particular home is recommended.
Below is a technical proposal they have received from a vendor for building the TNS system. Assess the proposal and provide constructive criticism: identify what you assess to be the three most important potential flaws and suggest a way to fix each of them. We have already verified that we have access to all the data that the proposal mentions.
Proposal from Pink Flamingo Consulting
The TNS score will be the output of a regression model that estimates how much our agent would make from the sale of each home. The training data will be historical home sales for the past two years. The target variable will be the commission the agent made from that sale. The features will include (a) aspects of the home, such as its size, number of bedrooms, how often it has sold in the past, whether it has a pool, etc., and (b) aspects of the homeowner, such as their age, income, family size, and whether they’ve had an important life event recently (promotion, layoff, engagement, marriage, divorce, retirement, child going to college, etc.). We will compare various modeling methods, such as linear regression and regression trees. We will separate the historical data into training and holdout data, and choose the model that has the best area under the ROC curve on the holdout data.
To provide agents with recommendations, we will apply the model to current data to estimate the target variable for each home in the region. Specifically, the TNS system will go through current data on every home in the geographic regions where the company operates and give that home a score. Then the system will recommend to each agent that they reach out to the 20 highest-scoring homeowners in their local area of operation. (They can get more such recommendations if they exhaust the first 20.)
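The recommendation step described above can be sketched as follows. This is only an illustration of the ranking mechanics as the proposal states them, not the vendor's implementation. The record layout (`home_id`, `area`, `tns_score`), the function name `top_recommendations`, and the sample scores are all hypothetical.

```python
from collections import defaultdict
from heapq import nlargest


def top_recommendations(scored_homes, per_area=20):
    """Group scored homes by area and keep the highest-scoring ones in each.

    `scored_homes` is an iterable of dicts with hypothetical keys
    "home_id", "area", and "tns_score" (the model's estimated commission).
    """
    by_area = defaultdict(list)
    for home in scored_homes:
        by_area[home["area"]].append(home)
    # nlargest returns each area's homes sorted from highest to lowest score.
    return {
        area: nlargest(per_area, homes, key=lambda h: h["tns_score"])
        for area, homes in by_area.items()
    }


# Made-up scores, for illustration only.
homes = [
    {"home_id": "H1", "area": "north", "tns_score": 12500.0},
    {"home_id": "H2", "area": "north", "tns_score": 30100.0},
    {"home_id": "H3", "area": "south", "tns_score": 18750.0},
    {"home_id": "H4", "area": "north", "tns_score": 22400.0},
]
recs = top_recommendations(homes, per_area=2)
north_ids = [h["home_id"] for h in recs["north"]]
south_ids = [h["home_id"] for h in recs["south"]]
```

With `per_area=20`, the same call would return the first batch of 20 per area; a further batch could be obtained by requesting a larger `per_area` and skipping the homes already shown.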
It is vital that the system accompany each recommendation with an explanation for why it was chosen. To do this, when recommending a home we will also reveal to the agents the two or three features that have the largest coefficients in the model – so, for example, “recently married” and “num bedrooms = 2” (newlyweds often want larger homes in preparation for their families and in-laws’ visits).
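A minimal sketch of the explanation step as the proposal words it, assuming a fitted linear model whose coefficients are available as a name-to-value mapping. The function name, the feature names, and the coefficient values are hypothetical. Ranking by absolute magnitude is also an assumption, since the proposal says only "largest coefficients".

```python
def largest_coefficient_features(coefficients, k=3):
    """Return the names of the k features with the largest coefficients.

    `coefficients` maps feature name -> fitted coefficient. Ranking uses
    absolute magnitude, which is an assumption about the proposal's intent.
    """
    ranked = sorted(coefficients.items(), key=lambda kv: abs(kv[1]), reverse=True)
    return [name for name, _ in ranked[:k]]


# Made-up coefficients, for illustration only.
coefs = {
    "recently_married": 4200.0,
    "num_bedrooms_is_2": 3100.0,
    "has_pool": -900.0,
    "owner_age": 35.0,
}
top = largest_coefficient_features(coefs, k=2)
```

Here `top` holds the two feature names that would be shown to the agent alongside a recommendation.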