Country & City Dimension
Country
The COUNTRY table serves as a comprehensive repository, capturing essential attributes and details about countries globally. It encompasses unique identifiers, standardized codes, geographical data, and practical information.
- COUNTRY_CD:
- Description: A short code representing the country.
- Sample Value: AD
- Example: The code AD stands for Andorra.
- ISO:
- Description: A two-letter standardized country code.
- Sample Value: AE
- Example: The code AE represents the United Arab Emirates.
- ISO3:
- Description: A three-letter standardized country code.
- Sample Value: AND
- Example: AND is the three-letter code for Andorra.
- ISO_NUMERIC:
- Description: A numeric representation of the country code.
- Sample Value: 020
- Example: The number 020 corresponds to Andorra.
- FIPS_CD:
- Description: Federal Information Processing Standards code for the country.
- Sample Value: AN
- Example: AN is the FIPS code for Andorra.
- COUNTRY_NAME:
- Description: The full name of the country.
- Sample Value: Afghanistan
- Example: The country name “Afghanistan” corresponds to the codes AF, AFG, and 004.
- CAPITAL:
- Description: The capital city of the country.
- Sample Value: Kabul
- Example: Kabul is the capital city of Afghanistan.
- AREA:
- Description: The total land area of the country in square kilometers.
- Sample Value: 647,500
- Example: Afghanistan covers an area of 647,500 sq km.
- POPULATION:
- Description: The total population of the country.
- Sample Value: 37,172,386
- Example: Afghanistan has a population of approximately 37.17 million.
- CONTINENT_CD:
- Description: The continent code where the country is located.
- Sample Value: AS
- Example: Countries like Afghanistan are located in Asia.
- CONTINENT_NAME
- Description: The continent name where the country is located
- Sample Value: Europe
- TLD:
- Description: The top-level domain associated with the country.
- Sample Value: .af
- Example: Websites from Afghanistan might end with .af.
- CURRENCY_CD:
- Description: The code for the country’s official currency.
- Sample Value: EUR
- Example: EUR stands for the Euro used in Andorra.
- CURRENCY_NAME:
- Description: The name of the country’s official currency.
- Sample Value: Euro
- Example: Euro is the official currency of Andorra.
- PHONE:
- Description: The phone code for the country.
- Sample Value: +1-268
- Example: +1-268 is the phone code for Antigua and Barbuda.
- POSTAL_CD_FORMAT:
- Description: The format used for postal codes in the country.
- Sample Value: AD###
- Example: In Andorra, postal codes follow a format like AD123.
- POSTAL_CD_REGEX:
- Description: A regular expression pattern for validating the country’s postal code.
- Sample Value: ^(?:AD)*(\d{3})$
- Example: This regex pattern matches the Andorran postal code format AD###.
- LANGUAGES:
- Description: An array of official languages spoken in the country.
- Sample Value: [“en-AG”]
- Example: English (en-AG) is the official language of Antigua and Barbuda.
- NEIGHBORS:
- Description: An array of neighboring countries.
- Sample Value: [“SA,OM”]
- Example: The United Arab Emirates has neighbors like Saudi Arabia (SA) and Oman (OM).
- EQUIVALENT_FIPS_CD:
- Description: An equivalent FIPS code for the country.
- Sample Value: (No sample provided in the dataset)
- Example: This might be an alternative or extended FIPS code for a region within a country.
Create 5 questions that someone would like to ask against the data and 3 sample queries to answer those questions, e.g. give me all of the countries in continent Europe
- Which countries have more than three neighboring countries?
1 2 3 |
SELECT COUNTRY FROM COUNTRY WHERE ARRAY_SIZE(SPLIT(NEIGHBOURS[0], ',')) > 3; |
- Which countries in Europe have a population less than 1 million?
1 2 3 4 |
SELECT COUNTRY, POPULATION FROM COUNTRY WHERE CONTINENT = 'EU' AND POPULATION < 1000000 ORDER BY POPULATION; |
- How many countries have more than one official language?How many countries have more than one official language?
1 2 3 |
SELECT COUNT(*) AS NumberOfMultilingualCountries FROM COUNTRY WHERE ARRAY_SIZE(split(LANGUAGES[0], ',')) > 1; |
- Which continents have the highest and lowest total population?
- What are the predominant languages in Asia?
City
The table provides a detailed overview of cities, combining geographical and administrative data. Precise geographical coordinates, including latitude, longitude, and elevation, are stored alongside unique identifiers like ‘GEO_ID’.
- COUNTRY_CD:
- Description: A short code indicating the country of the city. Foreign key to the Country table
- Sample Value: NP
- Example: The code NP stands for Nepal.
- CITY_NAME:
- Description: The name of the city.
- Sample Value: Jambari
- Example: Jambari is the name of a city or location in the dataset.
- ASCII_NAME:
- Description: The ASCII representation of the location’s name.
- Sample Value: Jambari
- Example: This ensures the city name can be read in systems that only support ASCII characters.
- LATITUDE:
- Description: Latitude coordinates of the location.
- Sample Value: 51.50853
- Example:
- LONGITUDE:
- Description: Longitude coordinates of the location.
- Sample Value: -0.12574
- Example:.
- POPULATION:
- Description: Indicates the population of the city.
- DEM:
- Description: Digital elevation model, srtm3 or gtopo30, average elevation of 3”x3” (ca 90mx90m) or 30”x30” (ca 900mx900m) area in meters, integer. srtm processed by cgiar/ciat.
- Sample Value: 88
- TIMEZONE:
- Description: The time zone in which the city is located.
- Sample Value: Asia/Kathmandu
- Example: The city follows the time zone of Kathmandu, Nepal.
- MODIFICATION_DT:
- Description: The date when the record was last modified.
- Sample Value: 2019-10-17
- Example: The details of this city were last updated or modified on October 17, 2019
Sample Queries
- Which are the top 2 cities with the highest population in each of the countries?
1 2 3 4 5 6 7 8 9 10 11 12 |
select V_CITY.COUNTRY_CD, V_COUNTRY.COUNTRY_NAME, V_CITY.CITY_NAME, V_CITY.POPULATION from V_CITY inner join V_COUNTRY on V_CITY.COUNTRY_CD = V_COUNTRY.COUNTRY_CD QUALIFY row_number() over (partition by V_CITY.COUNTRY_CD order by V_CITY.POPULATION desc) <= 2; |
- Which country has the most cities listed in the table?
1 2 3 4 5 6 7 8 9 10 11 12 |
select V_CITY.COUNTRY_CD, V_COUNTRY.COUNTRY_NAME, COUNT(*) AS NUMBER_OF_CITIES from V_CITY inner join V_COUNTRY on V_CITY.COUNTRY_CD = V_COUNTRY.COUNTRY_CD group by V_CITY.COUNTRY_CD, V_COUNTRY.COUNTRY_NAME QUALIFY ROW_NUMBER () OVER (ORDER BY NUMBER_OF_CITIES DESC)= 1; |
- Top 10 countries from the provided list which has the highest average recorded population across its cities
1 2 3 4 5 6 7 8 9 10 11 12 |
select V_CITY.COUNTRY_CD, V_COUNTRY.COUNTRY_NAME, AVG(V_CITY.POPULATION) AS AVERAGE_POPULATION from V_CITY inner join V_COUNTRY on V_CITY.COUNTRY_CD = V_COUNTRY.COUNTRY_CD group by V_CITY.COUNTRY_CD, V_COUNTRY.COUNTRY_NAME QUALIFY ROW_NUMBER () OVER (ORDER BY AVERAGE_POPULATION DESC) <= 10; |
Joining City and Country Tables:
- Which countries have more than 100 cities listed?
1 2 3 4 5 6 |
SELECT CO.COUNTRY, COUNT(*) as NUMBER_OF_CITIES FROM V_COUNTRY CO JOIN V_CITY CI ON CO.COUNTRY_CD = CI.COUNTRY_CD GROUP BY CO.COUNTRY HAVING COUNT(*) > 10000 ORDER BY NUMBER_OF_CITIES DESC; |
- Which country with English as one of its official languages has the most cities listed?
1 2 3 4 5 6 7 8 9 10 11 12 13 |
select V_CITY.COUNTRY_CD, V_COUNTRY.COUNTRY_NAME, COUNT(*) AS NUMBER_OF_CITIES from V_CITY inner join V_COUNTRY on V_CITY.COUNTRY_CD = V_COUNTRY.COUNTRY_CD WHERE TO_VARIANT(LANGUAGES)::STRING ILIKE '%en%' group by V_CITY.COUNTRY_CD, V_COUNTRY.COUNTRY_NAME QUALIFY ROW_NUMBER () OVER (ORDER BY NUMBER_OF_CITIES DESC)= 1; |