I have two tables in MySQL. Table Person has the following columns:
id | name | fruits
The fruits column may hold null or an array of strings such as (‘apple’, ‘orange’, ‘banana’) or (‘strawberry’). The second table is Table Fruit and has the following three columns:
fruit_name | color  | price
apple      | red    | 2
orange     | orange | 3
...,...
So how should I design the fruits column in the first table so that it can hold an array of strings that take values from the fruit_name column in the second table? Since there is no array data type in MySQL, how should I do it?
The proper way to do this is to use multiple tables and JOIN them in your queries.
    CREATE TABLE person (
        `id` INT NOT NULL PRIMARY KEY,
        `name` VARCHAR(50)
    );

    CREATE TABLE fruits (
        `fruit_name` VARCHAR(20) NOT NULL PRIMARY KEY,
        `color` VARCHAR(20),
        `price` INT
    );

    CREATE TABLE person_fruit (
        `person_id` INT NOT NULL,
        `fruit_name` VARCHAR(20) NOT NULL,
        PRIMARY KEY(`person_id`, `fruit_name`)
    );
The person_fruit table contains one row for each fruit a person is associated with and effectively links the person and fruits tables together, i.e.
    1 | "banana"
    1 | "apple"
    1 | "orange"
    2 | "strawberry"
    2 | "banana"
    2 | "apple"
When you want to retrieve a person and all of their fruit you can do something like this:
    SELECT p.*, f.*
    FROM person p
    INNER JOIN person_fruit pf
        ON pf.person_id = p.id
    INNER JOIN fruits f
        ON f.fruit_name = pf.fruit_name
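To see how the three-table design gives back a "list of fruits" per person, here is a minimal sketch in Python. It assumes an in-memory SQLite database as a stand-in for MySQL (so it runs without a server), and the people's names, the banana row, and the helper fruits_by_person are made up for illustration. It uses a LEFT JOIN rather than the INNER JOIN above so that a person with no fruit still shows up, with an empty list.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for MySQL so the sketch runs anywhere.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE person (id INTEGER NOT NULL PRIMARY KEY, name VARCHAR(50));
CREATE TABLE fruits (
    fruit_name VARCHAR(20) NOT NULL PRIMARY KEY,
    color VARCHAR(20),
    price INT
);
CREATE TABLE person_fruit (
    person_id INT NOT NULL,
    fruit_name VARCHAR(20) NOT NULL,
    PRIMARY KEY (person_id, fruit_name)
);
-- Hypothetical sample data
INSERT INTO person VALUES (1, 'Alice'), (2, 'Bob'), (3, 'Carol');
INSERT INTO fruits VALUES
    ('apple', 'red', 2), ('orange', 'orange', 3), ('banana', 'yellow', 1);
INSERT INTO person_fruit VALUES (1, 'apple'), (1, 'orange'), (2, 'banana');
""")


def fruits_by_person(conn):
    """Return {person name: [fruit names]}; people without fruit get []."""
    rows = conn.execute("""
        SELECT p.name, pf.fruit_name
        FROM person p
        LEFT JOIN person_fruit pf ON pf.person_id = p.id
        ORDER BY p.id, pf.fruit_name
    """)
    result = {}
    for name, fruit in rows:
        result.setdefault(name, [])
        if fruit is not None:  # NULL means the person has no linked fruit
            result[name].append(fruit)
    return result


people = fruits_by_person(conn)
print(people)
```

The "array" never exists in the database; it is assembled from the link rows when you read them.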
SQL has no arrays because most people don’t really need them. Relational databases (which is what SQL databases are) work with relations, and most of the time it is best to assign one row of a table to each “bit of information”. For example, where you might think “I’d like a list of stuff here”, make a new table instead, linking the row in one table with the row in another table. That way, you can represent M:N relationships. Another advantage is that those links will not clutter the row containing the linked item, and the database can index those rows. Arrays typically aren’t indexed.
If you don’t need relational databases, you can use e.g. a key-value store.
Read about database normalization, please. The golden rule is “[Every] non-key [attribute] must provide a fact about the key, the whole key, and nothing but the key.” An array does too much: it holds multiple facts, and it stores an order that is not related to the relation itself. Performance is poor as well, because arrays typically aren’t indexed (see above).
Imagine that you have a person table and a table of phone calls made by people. You could make each person row hold a list of that person's phone calls. But every person has many other relationships to many other things. Does that mean the person table should contain an array for every single thing a person is connected to? No, because those are not attributes of the person itself. It is okay if the linking table only has two columns (the primary keys from each table)! If the relationship itself has additional attributes, though, they should be represented in this table as columns.
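As a rough illustration of a linking table that carries its own attribute, the sketch below adds a quantity column to person_fruit and declares foreign keys so that a link can only name a fruit that exists in fruits. The quantity column, the sample rows, and the helper try_link are assumptions for the example, and SQLite again stands in for MySQL (the PRAGMA line is SQLite-specific; the table-level FOREIGN KEY syntax is also what MySQL's InnoDB expects).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite-specific: foreign keys are only enforced once this is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE person (id INTEGER NOT NULL PRIMARY KEY, name VARCHAR(50));
CREATE TABLE fruits (fruit_name VARCHAR(20) NOT NULL PRIMARY KEY);
CREATE TABLE person_fruit (
    person_id INT NOT NULL,
    fruit_name VARCHAR(20) NOT NULL,
    quantity INT NOT NULL DEFAULT 1,   -- hypothetical attribute of the link
    PRIMARY KEY (person_id, fruit_name),
    FOREIGN KEY (person_id) REFERENCES person (id),
    FOREIGN KEY (fruit_name) REFERENCES fruits (fruit_name)
);
INSERT INTO person VALUES (1, 'Alice');
INSERT INTO fruits VALUES ('apple'), ('banana');
INSERT INTO person_fruit VALUES (1, 'apple', 3);
""")


def try_link(conn, person_id, fruit_name, quantity=1):
    """Insert one link row; return False if a constraint rejects it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO person_fruit VALUES (?, ?, ?)",
                (person_id, fruit_name, quantity),
            )
        return True
    except sqlite3.IntegrityError:
        return False


ok = try_link(conn, 1, "banana", 2)         # known fruit
rejected = try_link(conn, 1, "dragonfruit")  # not in the fruits table
link_count = conn.execute("SELECT COUNT(*) FROM person_fruit").fetchone()[0]
print(ok, rejected, link_count)
```

The foreign key is what makes the values "come from" the fruit_name column, which a free-form array or string column cannot guarantee.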
MySQL 5.7 now provides a JSON data type. This new datatype provides a convenient new way to store complex data: lists, dictionaries, etc.
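A minimal sketch of the store-a-list-as-JSON idea, with the caveat that it runs against in-memory SQLite rather than MySQL: the fruits column is plain TEXT here and the encoding happens in Python, whereas in MySQL 5.7+ the column could be declared with the JSON type and the server would validate the document on insert. The helpers save_person and load_fruits and the sample rows are hypothetical.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumption: TEXT stands in for MySQL's JSON column type.
conn.execute(
    "CREATE TABLE person (id INTEGER PRIMARY KEY, name VARCHAR(50), fruits TEXT)"
)


def save_person(conn, person_id, name, fruits):
    """Store the list as a JSON document, or NULL when there is no list."""
    payload = None if fruits is None else json.dumps(fruits)
    conn.execute("INSERT INTO person VALUES (?, ?, ?)", (person_id, name, payload))


def load_fruits(conn, person_id):
    """Return the decoded list, or None for a missing row or a NULL column."""
    row = conn.execute(
        "SELECT fruits FROM person WHERE id = ?", (person_id,)
    ).fetchone()
    if row is None or row[0] is None:
        return None
    return json.loads(row[0])


save_person(conn, 1, "Alice", ["apple", "orange", "banana"])
save_person(conn, 2, "Bob", None)
print(load_fruits(conn, 1), load_fruits(conn, 2))
```

Note that nothing here ties the stored names to the fruit_name column of the other table; that check would have to live in application code.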
Arrays don’t map well to databases, which is why object-relational maps can be quite complex. Historically, people have stored lists/arrays in MySQL by creating a table that describes them and adding each value as its own record. The table may have only 2 or 3 columns, or it may contain many more. How you store this type of data really depends on the characteristics of the data.
For example, does the list contain a static or dynamic number of entries? Will the list stay small, or is it expected to grow to millions of records? Will there be lots of reads on this table? Lots of writes? Lots of updates? These are all factors that need to be considered when deciding how to store collections of data.
Also, key:value data stores and document stores such as Cassandra, MongoDB, Redis, etc. provide a good solution as well. Just be aware of where the data is actually being stored (whether it’s on disk or in memory). Not all of your data needs to be in the same database. Some data does not map well to a relational database, and you may have reasons for storing it elsewhere, or you may want to use an in-memory key:value database as a hot cache for data stored on disk somewhere, or as ephemeral storage for things like sessions.
As a side note, you can store arrays in Postgres.
Use the BLOB field type to store arrays, serialized with e.g. PHP's serialize(). The PHP manual says of its return value:
Returns a string containing a byte-stream representation of value that
can be stored anywhere.
Note that this is a binary string which may include null bytes, and
needs to be stored and handled as such. For example, serialize()
output should generally be stored in a BLOB field in a database,
rather than a CHAR or TEXT field.
You can store your array using GROUP_CONCAT, like this:
    INSERT INTO Table1 (fruits)
    SELECT GROUP_CONCAT(fruit_name) FROM table2
    WHERE ..... -- your clause here
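For a concrete feel of what that statement stores, here is a small sketch run against in-memory SQLite, which also has GROUP_CONCAT with a comma as the default separator. The price filter, the id and price columns, and the sample rows are assumptions added for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table2 (fruit_name VARCHAR(20) PRIMARY KEY, price INT);
CREATE TABLE Table1 (id INTEGER PRIMARY KEY, fruits TEXT);
-- Hypothetical sample data
INSERT INTO table2 VALUES ('apple', 2), ('orange', 3), ('banana', 1);
""")

# Collapse the matching fruit names into one comma-separated string.
conn.execute("""
    INSERT INTO Table1 (fruits)
    SELECT GROUP_CONCAT(fruit_name) FROM table2
    WHERE price <= 2
""")

stored = conn.execute("SELECT fruits FROM Table1").fetchone()[0]
# The concatenation order is not guaranteed, so sort after splitting.
fruit_list = sorted(stored.split(","))
print(stored, fruit_list)
```

The result is a plain string: it has to be split in application code, it breaks if a name contains a comma, and in MySQL the output is cut off at group_concat_max_len (1024 bytes by default).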