How to Remove Duplicates using UPDATE

When working with large companies or prospects lists on dashdash, you might generate multiple instances of the same records that you want to remove to clean up your list. While dashdash currently doesn’t offer a feature to remove duplicates, you can easily do so using the UPDATE function.

Let’s assume we have this spreadsheet with two views; Table and New table. The Table view has a table with four entries.

The UPDATE function has five parameters; four required ones and an optional one.

  • source_header: The header range of the table where data will be updated or inserted from. It can be a single row or column.
  • keys: The number of columns/rows, counting from the left/top, to be used as keys when searching for unique rows/columns, respectively.
  • data: The cell or range with data to be updated or inserted.
  • destination: The header range of the table where data will be updated or inserted to. It can be a single row or column.
  • auto (optional): The flag that determines if more headers are added automatically. By default, it’s set to true.

To create the new table without duplicates, we’ll use UPDATE as follows:

C1: =UPDATE(A1:B1,1,A2:B5,‘New table’!A1:B1)

Let’s break down the parameters:

  • source_header: This is the header of the original table. We’ll set it to A1:B1.
  • keys: The number of columns from the left we’re using as unique keys. We’re setting this to 1, which means the “Company name” will be our unique identifier, meaning no two entries can have the same company name.
  • data: The entries of our original table found in A2:B5 or just A2:B.
  • destination: The location of our new table. Since we’re adding it to the first row of the other view, we’ll set this is 'New table'!A1:B1.

This will create a new table with only two entries, as seen below. Since “Company name” is the key, only one instance of “dashdash” and one instance of “Google” are shown.

Now, if we change the keys parameter to 2.

C1: =UPDATE(A1:B1,2,A2:B5,‘New table’!A1:B1)

In this case, we’ll be getting three entries. The reason is that we are checking the rows that contain the same values in both the “Company name” and “domain” columns. While the “dashdash” entries here share the same company name, they’re not seen as duplicates in this case since they have different values in the “Domain” column.

4 Likes