SQL interview questions
Q1 What are joins in SQL?
A join is an operation that is used to combine data from multiple tables into a new table. Different types of joins specify how data between tables are matched into the new table. When you need to retrieve data from multiple tables in a single query, there’s a good chance that you’ll be using a join operation.
Interviewers ask this question as a way to test your fundamental understanding of how data is queried and how it’s necessary for almost every kind of application. We cannot simply dump all data into one table as it will get unwieldy and poorly organized.
Example: We would want to store a table of customers (with their name, address, company they represent, etc.) separately from a table of transactions (with the items purchased, when the transaction was made, how much items cost, who made the purchase, etc.).
When we store the information of who made the purchase, we would not want to duplicate all of the customer’s information into every row of the transactions table, and we would only want to store the customer ID.
In order to answer a question such as “how many purchases were made by customers who live in a specific zip code?”, we would need to join the two tables to get this answer:
SELECT COUNT(*) FROM customers c JOIN transactions t ON t.customer_id = c.id WHERE c.zipcode = 94107;
Q2 What is the difference between DELETE and TRUNCATE statements?
This is one of those SQL interview questions designed to gauge your understanding of how database rows are stored and managed internally.
While DELETE and TRUNCATE can both be used to remove all data from a table, the database processes these queries differently. DELETE operations can filter rows that are targeted as it supports a WHERE clause, whereas a TRUNCATE operation removes an entire table.
When deleting a whole table with these two operations, TRUNCATE performs faster at the expense of being unable to perform some operations that rely on the transaction log. For example, some SQL servers can rollback DELETE operations and not TRUNCATE operations.
In general, you should opt to use the DELETE operation due to benefits such as a history in the transaction log. In situations where a DELETE operation on an entire table takes too long, you can look into using the TRUNCATE operation. An interviewer would be interested in hearing about the tradeoffs between the two operations.
Q3 What is the difference between a primary key and unique key?
Primary keys must be unique and are used to identify table records, whereas unique keys serve as constraints in the table’s data. Understanding unique keys shows the interviewer that you understand some of the important ways to maintain a database table’s data integrity.
For example, when you’re designing a table to track users, you may have a column for user_id as the primary key. user_id will be unique and is used to reference the rest of the data in a row. To maintain data quality, you may have a column for phone_number as a unique key to ensure that only one of each phone number can exist in the table.
Q4 What is the purpose of a foreign key in SQL?
Foreign keys are columns used to reference primary keys in other tables. This helps set a constraint on the column to enforce referential integrity across tables. Foreign keys are a fundamental concept in database table design, and interviewers will want to verify that you understand them and have used them before in the past. Understanding them shows that you know how to design schemas that span across multiple database tables.
For example, imagine that you have two tables: orders and users. Every order should have been created by a user so the orders table can have a foreign key to the user table’s primary key. This constraint ensures that every user defined in the orders table is referencing a valid row.
Q5 What are some ways to optimize a query?
Query plans are a useful way to examine how a query will be performed, and can help you better understand what is making your query slow.
For example, prepending a query in PostgreSQL with EXPLAIN will show the query plan for the command. This will reveal the table scans that will be involved in the query.
So if we wanted to check the behavior of the query: SELECT * FROM table_1;, we can run the query EXPLAIN SELECT * FROM table_1;. From there, the query plan will break down the various steps that will be taken to execute the query. It’s a common way to identify unnecessary full table scans that could be alleviated with setting up proper indices.
These techniques are important because interviewers want to understand how you may approach troubleshooting database queries. If a candidate only knows how to run a query against the database to measure performance, it shows a lack of familiarity and industry experience with using relational databases.
Q6 What are some ways to identify how a query can be optimized?
Queries can be optimized in many ways. A few common examples are as follows:
Reduce amount of data to query with WHERE clauses.
Limit the amount of useful rows the database needs to query with a LIMIT clause.
Add an index on columns that are frequently queried.
The goal for this question is often not to just list every example above. Since this question is open-ended, some interviewers may present this question with a scenario where certain optimizations can be identified and applied.
Q7 What is normalization and what are the advantages of it?
Database normalization is a strategy to efficiently organize data in a database. The goal is to reduce redundant data so that the same data is not stored across multiple tables. Instead, data will be referenced with a primary key.
For example, a table named orders may have a column named user_id. Rather than maintaining a copy of user data in the orders table, we can simply reference that data from another table with a join.
This is often asked to gauge a candidate’s understanding of table design. Normalizing data is a key component of designing table schemas in relational databases.
Q8 What are entities and relationships?
An entity is an abstraction of a set of related data and are represented as tables. Relationships define how entities are associated with one another.
For example, let’s say we have two tables named orders and users. Our orders and users are our entities. We can imagine that one user could possibly have many orders. Therefore, users can have a “one-to-many” relationship with orders.
Entities and relationships are often used in the table schema design process. By understanding how to define entities and map their relationships, you show that you can be productive in a team’s collaborative session on database table design.
Q9 How can you set up a table so that queries for certain rows won’t result in full table scans?
Full table scans can be possibly avoided by querying on indexed columns and using limits. Indexed columns help the database optimize how it performs lookups on the tables.
There are many strategies for reducing the rows queried and these can be confirmed with query plans. Interviewers will often ask this SQL question to gauge your understanding of how to properly design SQL tables and optimize queries.