Out of seven billion inhabitants on Earth, approximately 75% [1] do not have proper addresses that allow their houses, properties or businesses to be located on a map with reasonable precision. To illustrate, consider the two addresses of one of our authors on two different continents:

  Figure 1: Every place deserves an address. Seen here is the city of Jaipur in India. Photo Credit: Debabrata Majumdar

  • [Name], 7116 Via Correto Dr, Austin, TX 78749: the location for this address is readily available, and any navigation system can take you to the doorstep, day or night, in good weather or bad. This is an example of a structured address.
  • [Name], College Tilla, PO Agartala College, Agartala 799004: Google Maps resolves this address to an area of roughly 76 sq. km. in the city of Agartala. This address is unstructured.
  • Adding a landmark to the above address (“College Tilla near College Tilla Lake, ..”) narrows the answer to an area of approximately 3 sq. km. Upon reaching the area, and depending on how much is known about the occupant, the house or the location (e.g., whose son or brother the occupant is, his age, the color of his house, whether it stands by the old mango tree), one would need an additional 15-45 minutes to find the house, provided it is not late at night or raining. Moreover, landmark-based addressing is infrequent, incomplete and inconsistent.

  Figure 2: Most addresses in the developed world resolve within a narrow area, whereas those in India can resolve anywhere from a town to a locality to (rarely) a house

These are not merely one-off experiences that cause inconvenience. The inability to provide an accurate location for every address affects the livelihood of residents in many ways. It inhibits the growth of local trades such as salons, bakeries and food stalls. It restricts access to amenities such as bank accounts and the delivery of goods and services (e.g., e-commerce), and it delays emergency services such as fire brigades and ambulances.

Economic Impact on Industry

Case Study 1: Logistics and Transportation

The inability to locate an address with reasonable accuracy, as demonstrated in the previous example, hampers a transporter’s ability to deliver shipments on time and without incurring additional costs. Consider the e-commerce industry, or any industry where goods are delivered to an address. In the absence of proper geocodes, most companies in India “sort” packages or goods by pincode, since the pincode is the only numeric location identifier that appears in most written addresses in India.

Sorting deliveries by pincode, and even siting facilities (which store shipments for delivery or return) according to the demand volume in each pincode, sounds like a logical solution. However, two major practical problems arise in implementing this idea:

  • About 30-40% [2] of the pincodes in India are written incorrectly, leading to shipments being misrouted and requiring manual intervention for eventual routing to the correct pincode
  • The average pincode covers 179 sq. km., with about 135,000 households and over 100,000 businesses, educational institutions, government buildings etc. Sorting deliverables for roughly a quarter million addresses, of which only 30% are structured, poses many challenges

We will pick an area within the Logistics and Transportation industry to illustrate the challenge in detail. Consider e-commerce, which is growing at a 30% CAGR [3], driven by rising income, consumption and digitization [4], and is expected to continue to do so for the foreseeable future. Indian consumers expect products to be delivered to their doorsteps for free, which places a unique burden on e-commerce companies that is not seen in most parts of the world.

When a product is ordered from an e-commerce site, the merchandise is picked up from a seller or a warehouse and brought to a processing center, where it is sorted for the destination city, a process known as the “first mile operation”. It is then transported between the origin and destination cities in a “line haul” that involves long-distance transportation by truck, air etc. In the “last mile operation”, the merchandise goes to the delivery center, from where it is delivered to the shopper’s house. Figure 3 illustrates this process.

  Figure 3: The “last mile cost” in India is ~30% of the total cost of delivery

In western countries, structured addresses lead to relatively accurate geocoding, and consequently the last mile cost is about 10-12% of the total cost [5]. In India, the same cost is ~30% of the total cost of delivery, notwithstanding India’s low cost of labor. The extra cost comes from the longer time that a driver takes to locate an address: driving towards it, stopping multiple times to call the recipient or to ask people on the road or in nearby shops for directions, and covering additional kilometers in search of the address.

Figure 4 illustrates the challenges of delivering to addresses that cannot be disambiguated at the house level. In the absence of a lat-long for the desired address, packages are sorted by pincode, and all the packages for one pincode are sorted, stored and delivered from one (or, sometimes, two or more) delivery center(s). A typical pincode-based sorting center, located at the orange pin, can have a delivery “throw” (radius) of 4-20 km. Without an understanding of how the load is distributed across addresses, such centers often end up poorly located relative to demand. For example, in this case, while one delivery biker drives about 22 km, the other drives about 104 km, almost 5 times the distance covered by the first. Such problems significantly hamper initiatives to improve productivity and reduce costs at these centers.

  Figure 4: The difference between pincode-based sorting and local-address-based sorting. In this picture, the light orange area shows a typical pincode boundary. Productivity at this delivery center can vary widely as optimal route planning becomes complex, as depicted by the two delivery bikers’ routes

If, on the other hand, addresses can be disambiguated down to the household level, then each individual locality can have a small delivery center, much as most areas have a pharmacy, a grocery store or a restaurant. In such cases, the “throw” of the delivery center goes down to an average of 1-3 km. The bikers typically cover small distances in a shift, and multiple shifts can be run in a day to match the schedule of the multiple trucks that bring loads throughout the day from the destination hubs. In Figure 4, the locations of such small delivery centers are represented by red pins and the areas they cover are shown in yellow. This results in a significant improvement in productivity at the smaller centers that can do address-based sorting.
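
The effect of locality-level centers on the “throw” can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the coordinates, center names and helper functions below are hypothetical, and straight-line (haversine) distance stands in for actual road distance. Each geocoded address is assigned to its nearest center, and the largest center-to-address distance is reported per center.

```python
import math

def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) points given in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))

def assign_to_nearest(addresses, centers):
    """Map each center name to the (address_id, distance_km) pairs it serves."""
    plan = {name: [] for name in centers}
    for addr_id, point in addresses.items():
        nearest = min(centers, key=lambda c: haversine_km(point, centers[c]))
        plan[nearest].append((addr_id, haversine_km(point, centers[nearest])))
    return plan

def max_throw_km(plan):
    """Largest center-to-address distance for each center."""
    return {name: max((d for _, d in stops), default=0.0)
            for name, stops in plan.items()}

# Hypothetical coordinates, for illustration only
pincode_center = {"pincode_dc": (28.60, 77.20)}
locality_centers = {"dc_north": (28.68, 77.20), "dc_south": (28.52, 77.20)}
addresses = {
    "a1": (28.69, 77.21),
    "a2": (28.67, 77.19),
    "a3": (28.53, 77.21),
    "a4": (28.51, 77.19),
}

by_pincode = max_throw_km(assign_to_nearest(addresses, pincode_center))
by_locality = max_throw_km(assign_to_nearest(addresses, locality_centers))
print(by_pincode)
print(by_locality)
```

With these made-up points, the single pincode-level center has a throw of roughly 10 km, while each of the two locality-level centers stays under 2 km. Note that none of this is possible without a lat-long for every address, which is exactly what unstructured addresses fail to provide.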

Moreover, greater granularity in the address geocodes also allows us to perform route optimisation and provide system-driven routes for the delivery bikers. We present the case of Delhivery, one of India’s leading logistics providers for e-commerce companies.

At Delhivery, a switch from pincode-based to locality-based sorting has improved the productivity of the last mile operation by 40-60%, depending on the type and complexity of the addresses and the size and shape of the locality.

  Table 1: Address-based sorting can result in 40-60% better productivity

This high last-mile cost disproportionately affects a company’s bottom line. In a simplified analysis in Table 2, we demonstrate that better geocoding that reduces the last-mile cost by 40%, well within the reach of current technology, can improve the profitability of an e-commerce company.

  Table 2: Illustration: An improvement in the last mile cost can swing the profitability of an e-commerce business

Even for a small industry in India such as e-commerce delivery, estimated to be a 5,000 crore (~$775M) business annually (as of 2017), the annual cost savings from a better addressing scheme are about 650 crore (~$100M).
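
As a rough cross-check of these figures, the savings can be approximated as the industry size multiplied by the last-mile share of cost (~30%) and by the reduction in last-mile cost (40%). The Python sketch below makes one assumption that the text does not state: that the 5,000 crore figure can be treated as the cost base. The function name is ours.

```python
def last_mile_savings(total_cost, last_mile_share, last_mile_reduction):
    """Savings = total cost x share spent on the last mile x fractional reduction."""
    return total_cost * last_mile_share * last_mile_reduction

# Figures quoted in the article; using the industry size as the cost base is an assumption
industry_size_crore = 5000
savings_crore = last_mile_savings(industry_size_crore, 0.30, 0.40)

# Exchange rate implied by "5,000 crore (~$775M)"; 1 crore = 10 million rupees
inr_per_usd = (5000 * 1e7) / 775e6
savings_usd_million = savings_crore * 1e7 / inr_per_usd / 1e6

print(round(savings_crore), "crore, or about", round(savings_usd_million), "million USD")
```

This back-of-the-envelope estimate comes to about 600 crore, the same order of magnitude as the 650 crore quoted above, which presumably rests on more detailed inputs.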

For the Logistics and Transportation industry, the same framework can be used for different types of goods and services transportation. When a shipment is moved from one city to another, the line-haul segment, which is typically the inter-city transport, is not much affected by the lack of properly geocoded addresses. However, both the first mile (pickup from a client or a distribution center, for example) and the last mile (delivery to a house or business) are affected significantly by the inability to resolve an address, whether a motorbike, truck or bicycle is used for the operation.

Case Study 2: Loan and Financial Services

India is a credit-deprived country where 642 million people, a staggering 53%, are excluded from formal financial products such as loans, insurance and other forms of credit and financial services. Even among those who are engaged in trades or small businesses, 48% cannot access formal credit or loans. The impact on the economy is significant. McKinsey estimates that digital financial services in India could yield a payoff of $700 billion by 2025 and create an additional 21 million jobs [6].

The reasons for the paucity of credit are many: the lack of a verifiable identity (akin to a Social Security number in the USA), the absence of proof of formal income in a largely cash-driven economy, and the complexity of disambiguating one’s location, be it home or place of business.

This has started to change in the past few years. The Government’s initiative to provide a biometric identity to all Indians (“Aadhaar”) has, for the first time in history, given over 95% of Indians a verifiable identity. Additionally, initiatives to open over 100 million bank accounts for the financially disadvantaged and the push for digital transactions have persuaded many startups to consider providing loan and credit services to the formerly unserved population.

Consequently, in the past two years, funded by large venture capital investments in financial technology (“fintech”) companies, over 100 startups have started providing services for connecting borrowers and lenders.

Figure 5 shows a typical process promised by one such service. The process is reasonably straightforward. Once a user applies online or through the app and selects a product, they are typically asked for 5-8 sets of documents:

  • A proof of identity such as an Aadhaar card, passport or voter ID card
  • A proof of address such as a lease or house ownership documents, e.g., a sale deed
  • Proof of income such as paycheck, tax returns or business earning
  • Proof of educational qualifications, particularly for students
  • Proof of age for loan eligibility
  • Employment verification, such as a certified letter from the employer
  • Bank statements
  • Additional documentation as proof of residence (utility bills such as phone, water or electricity) or a letter from the employer on its official letterhead, especially if original identity documents such as the Aadhaar card or passport were issued in another state
  Figure 5: Typical loan generation process promised by one of India’s many digital loan or financing start-ups

A courier picks up these documents from the borrower. The documents are then scanned and saved in a database, where they are compared against the information provided by the borrower on the loan application. Since the documents are typically on paper, an Optical Character Recognition (OCR) system is used to transcribe the information directly into the database.

This process works well for about 60-70% of borrowers, especially in large cities. From our survey of leading firms, we estimate that about 70% of the documents are considered a “match” and go to the next step of loan processing (e.g., loan eligibility analysis, approval of loans etc.), albeit with only a certain percentage of applicants being eligible for a loan.

However, for the remaining ~30% of applications, the address provided by the applicant in the application does not match the documents. To understand why, consider the following addresses for the same house, shown in Figure 6.

  Figure 6: Same address written in multiple formats in different documents

Consider the different variations:

  • The house number has been written in three formats: “TH-146B”, “146” and “Unit 146”
  • The community has been described as “Purva Parkridge” and “Purva Park Ridge” and abbreviated as “PPR”: again, three different forms
  • The road name has been spelled in two ways, “Goshala Road” and “Ghosala Road”, and omitted altogether in one document
  • The locality is described as “Garuda Char Palya” and “Garuda Charpalya”. In one document, a neighboring community, “Mahadevpura” has been substituted

The Aadhaar card seems to have taken the safest approach, adding both localities, “Mahadevpura” and “Garuda Charpalya”, to the same address. In other words, just four different official address verification documents can produce over 50 combinations.
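
The count can be checked with a small Python sketch. It assumes three forms per component, as listed in the bullets above (including “road omitted” and the substituted locality), and simply enumerates every way of combining them; the variable names are ours.

```python
from itertools import product

# Variants as listed above; "" stands for the document that omits the road name
house = ["TH-146B", "146", "Unit 146"]
community = ["Purva Parkridge", "Purva Park Ridge", "PPR"]
road = ["Goshala Road", "Ghosala Road", ""]
locality = ["Garuda Char Palya", "Garuda Charpalya", "Mahadevpura"]

# Build every distinct written form of the same physical address
variants = {
    ", ".join(part for part in parts if part)
    for parts in product(house, community, road, locality)
}
count = len(variants)
print(count)
```

Under these assumptions there are 3 x 3 x 3 x 3 = 81 distinct written forms of one house, comfortably above 50, before counting the Aadhaar variant that lists both localities.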

It is therefore not surprising that the addresses provided in documents often do not match. Since this information is processed in a centralized facility, the people there would have no way of knowing that “Purva Park Ridge” and “PPR” are the same community, or that “Garuda Char Palya”, with its different spellings, is often interchanged with its neighbouring community “Mahadevpura”.
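
To make the matching problem concrete, here is a minimal Python sketch of how two written forms of this address might be compared. It is not the method used by any of the firms surveyed: the alias table is hypothetical and hand-written for this one example, and the similarity score comes from the standard library’s `difflib.SequenceMatcher`.

```python
import re
from difflib import SequenceMatcher

# Hypothetical alias table; a real system would need curated local knowledge
ALIASES = {
    "ppr": "purva park ridge",
    "purva parkridge": "purva park ridge",
    "ghosala": "goshala",
    "garuda char palya": "garuda charpalya",
}

def normalize(address):
    """Lower-case, replace punctuation with spaces, and apply known aliases."""
    text = re.sub(r"[^a-z0-9 ]+", " ", address.lower())
    text = re.sub(r"\s+", " ", text).strip()
    for alias, canonical in ALIASES.items():
        text = re.sub(rf"\b{re.escape(alias)}\b", canonical, text)
    return text

def similarity(a, b):
    """Similarity in [0, 1] between two addresses after normalization."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()

doc_a = "TH-146B, Purva Parkridge, Goshala Road, Garuda Char Palya"
doc_b = "Unit 146, PPR, Ghosala Road, Garuda Charpalya"

raw_score = SequenceMatcher(None, doc_a, doc_b).ratio()
print(round(raw_score, 2), round(similarity(doc_a, doc_b), 2))
```

Normalization raises the score for these two strings well above the raw comparison, but only because the aliases were supplied by hand. A substitution such as “Mahadevpura” for “Garuda Charpalya” cannot be resolved by string similarity at all; it requires knowing that the two localities are neighbours, which is the local knowledge a centralized facility lacks.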

For the 30% of documents that do not match, the following process kicks in, as depicted in Figure 7.

  Figure 7: The 30% of applicants whose addresses do not match directly are either asked for additional documentation or have their addresses manually verified, adding to the time and cost for the service providers

This delays loan approval by up to 5 days in the best cases, and sometimes by weeks or months. It affects both the borrower, who could be in urgent need of money, and the lender, who has to bear the cost of additional verification and the loss of the interest that could have been earned until the loan is finally processed.
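
The interest foregone during such a delay can be approximated with simple interest. The Python sketch below is purely illustrative: the loan amount and the annual rate are hypothetical and are not taken from our survey; only the 5-day delay comes from the text, with 30 days standing in for a longer delay.

```python
def interest_foregone(principal, annual_rate, delay_days):
    """Simple interest a lender does not earn while disbursal is delayed."""
    return principal * annual_rate * delay_days / 365

# Hypothetical loan, for illustration only: 100,000 rupees at 18% a year
for delay_days in (5, 30):
    print(delay_days, "days:", round(interest_foregone(100_000, 0.18, delay_days), 2))
```

The loss per loan grows linearly with the delay, so it is small for a single loan but adds up across the roughly 30% of applications that need extra verification.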

  Table 3: Bad addresses delay verification and approval resulting in the loss of interest to moneylenders

Also, because the place of dwelling or business is a key factor in the risk assessment of a loan, the inability to disambiguate it shows up in the risk models, raising the interest rate and hence the overall cost of the loan.

Pan-India Economic Impact

We conducted a similar analysis for the top three industries, namely Logistics, Manufacturing (including consumer goods) and Emergency Services, to derive a cost estimate for India. Using this approach, our estimates indicate that poor addresses cost India $10-14B annually, ~0.5% of GDP; see Table 4.

  Table 4: The economic cost of bad addresses in India

Note further that the numbers presented in Table 4 capture the cost of bad addresses but do not include the additional benefits of better addresses, such as rising productivity and income gains, which lead to further growth of businesses and GDP.

Conclusion

Easily discoverable addresses are important for rapidly growing economies like India. Rather than just being a convenience, addresses are vital for driving self-reinforcing economic cycles and, therefore, improving livelihoods and incomes for the next billion Indians. Consumers independently identify and adopt addresses for their own convenience, while businesses use technology or third-party services to resolve these addresses into geocodes to deliver products and services at reduced costs. However, the current addressing system in India does not lend itself to disambiguation to a reasonably accurate lat-long for most addresses.

Our case-study analyses indicate that the lack of a good addressing system costs India at least $10-14B a year, or about 0.5% of its annual Gross Domestic Product. As the Indian economy continues to grow in both economic output and the variety of new businesses and services, the costs due to the lack of a proper addressing system will increase significantly. India therefore needs to consider a dramatically new approach to modernizing its addressing system to bring in efficiency.

About the authors:

Dr. Santanu Bhattacharya is a scientist collaborating with the Camera Culture Group at MIT Media Lab. A serial entrepreneur who has led Emerging Market Phones at Facebook, he is a former physicist from NASA Goddard Space Flight Center.

Sai Sri Sathya is a researcher collaborating with REDX and Camera Culture Group at MIT Media Lab and formerly at the Connectivity Lab at Facebook, focused on Emerging World Innovations.

Dr. Kabir Rustogi leads the Data Science team at Delhivery, India’s largest e-commerce logistics company. A published author, he was previously a Senior Lecturer of Operations Research at The University of Greenwich, UK.

Dr. Ramesh Raskar is Associate Professor at MIT Media Lab and leads the Emerging Worlds Initiative at MIT which aims to use global digital platforms to solve major social problems.

References

[1] Startup What3words Aims To Give Billions Of People One Thing They Don't Have https://www.forbes.com/sites/rebeccafeng/2016/06/11/new-company-aims-to-give-billions-of-people-one-thing-they-dont-have-an-address/#2a6a8aa92b3c

[2] Sample from Delhivery’s database of 10 million addresses

[3] Morgan Stanley Bets On Digital To Forecast $6 Trillion Economy, Sensex At 1,30,000 https://www.bloombergquint.com/markets/2017/09/28/morgan-stanley-bets-on-digital-to-forecast-6-trillion-economy-sensex-at-130000

[4] Morgan Stanley Report “India’s Digital Future” http://www.morganstanley.com/ideas/digital-india

[5] Private conversation with multiple stakeholders at Amazon, FedEx and Staples e-commerce

[6] How digital finance could boost growth in emerging economies (September 2016) https://www.mckinsey.com/global-themes/employment-and-growth/how-digital-finance-could-boost-growth-in-emerging-economies
