Why use DISTINCT in MySQL
This must have resulted in the value being NULL in our table. What is this NULL value? A NULL value means that the field has no value present. For this, we will use a query that combines COUNT with DISTINCT: the distinct values are passed to the COUNT function, which counts the number of unique values and returns that count as the result set.
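As a minimal sketch of how COUNT with DISTINCT treats NULL, the snippet below runs the query against an in-memory SQLite database from Python. SQLite is only a stand-in for MySQL here, and the `customers` table and its rows are hypothetical; the SQL statements themselves are standard and should behave the same way in MySQL.

```python
import sqlite3

# SQLite stands in for MySQL; table name and data are made up for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, country TEXT)")
conn.executemany(
    "INSERT INTO customers (country) VALUES (?)",
    [("Germany",), ("Mexico",), ("Germany",), (None,), (None,)],
)

# COUNT(*) counts every row, including rows where country is NULL.
total_rows = conn.execute("SELECT COUNT(*) FROM customers").fetchone()[0]

# COUNT(DISTINCT country) counts unique non-NULL values only.
distinct_countries = conn.execute(
    "SELECT COUNT(DISTINCT country) FROM customers"
).fetchone()[0]

print(total_rows, distinct_countries)  # 5 2
```

The two NULL rows contribute to the row count but not to the distinct count, which is why the numbers differ by more than the one duplicated "Germany".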
Finally, I get nervous when I see "group by" used as a substitute for distinct, not because it's "wrong", but because stylistically, I believe group by should be used for aggregate functions.
That's just a personal preference. In your example, DISTINCT and GROUP BY do the same thing. I think your colleague means that your query should not return duplicates in the first place, and that you should be able to write it without a DISTINCT or GROUP BY clause. You may be able to reduce the duplicates by extending your join conditions.
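To illustrate the claim that DISTINCT and GROUP BY return the same rows when no aggregate is involved, here is a small sketch using Python's sqlite3 module as a stand-in for MySQL. The `orders` table and its contents are hypothetical.

```python
import sqlite3

# SQLite stands in for MySQL; the orders table is invented for this example.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer TEXT, city TEXT)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [("Ann", "Oslo"), ("Ann", "Oslo"), ("Bob", "Rome"), ("Ann", "Rome")],
)

# De-duplicate with DISTINCT.
with_distinct = conn.execute(
    "SELECT DISTINCT customer, city FROM orders ORDER BY customer, city"
).fetchall()

# De-duplicate with GROUP BY on the same columns, no aggregate function.
with_group_by = conn.execute(
    "SELECT customer, city FROM orders "
    "GROUP BY customer, city ORDER BY customer, city"
).fetchall()

print(with_distinct == with_group_by)  # True
```

Both queries collapse the repeated ("Ann", "Oslo") row, so the choice between them in this situation is about readability rather than results.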
Ask them why it is a bad practice. A lot of people make up rules or come up with things that they consider bad practice from reading the first page of a book or the first result of a Google search.
If it does the job and doesn't cause any issues, there is no reason to create more work by finding alternatives. Of the two options you have posted, I would use DISTINCT too, because it's shorter and easier to read and maintain. If you're querying a table that is expected to have repeated values of some field or combination of fields, and you're reporting a list of the values or combinations of values and not performing any aggregations on them, then DISTINCT is the most sensible thing to use.
Rather, you should figure out the cause of the bug and fix it. Yes, DISTINCT tends to raise a little alarm in my head when I come across it in someone's query.
It is required in some cases, of course, but most data models should not require it. It tends to be a last resort, or an outlier case. It may also be symptomatic of a bad application sitting on top of the database, one that allows duplicate entries to be inserted (or rows to be updated into duplicates), with no corresponding database-level constraints to prevent such actions.
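One way to picture a database-level constraint that stops duplicates at the source is a UNIQUE constraint over the columns that should never repeat. The sketch below uses Python's sqlite3 module as a stand-in for MySQL; the `person_address` table is hypothetical, and MySQL would report its own duplicate-key error rather than the Python exception shown here.

```python
import sqlite3

# SQLite stands in for MySQL; the table definition is illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE person_address ("
    "  person_id INTEGER NOT NULL,"
    "  address   TEXT    NOT NULL,"
    "  UNIQUE (person_id, address)"  # the same pair cannot be stored twice
    ")"
)
conn.execute("INSERT INTO person_address VALUES (1, '12 High St')")

try:
    # Second insert of the identical pair violates the UNIQUE constraint.
    conn.execute("INSERT INTO person_address VALUES (1, '12 High St')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM person_address").fetchone()[0]
print(duplicate_rejected, row_count)  # True 1
```

With the constraint in place, queries over this table have no duplicate pairs to filter out, so DISTINCT is no longer needed to paper over bad data.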
So the first thing to check is the data. It could be a sign of bad data model design. But most likely the query should not get to a stage in a select where duplicate rows are lingering. In constructing a large query, I would normally start with the nugget of a subquery that specifies the unique fields; any subquery after that must inner join or left join onto it, but never add to or reduce the number of rows already defined by the nugget query.
So for example, the nugget query could select the right rows by using partitions to, for example, select the most recent row of a joined table, or by doing some other grouping at that stage. In your example, I would not expect duplicates. If a person can have historical addresses, fine, but then do you need to see all addresses, or only the most recent? And if there were duplicate addresses for the same person, does that mean incorrectly duplicated data, or does it mean the person left that address but returned to it later? This means that all other data hangs off this nugget of a subquery.
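A rough sketch of that idea, assuming "partitions" refers to window functions: ROW_NUMBER() with PARTITION BY picks the most recent address per person, so the join keeps exactly one row per person and no DISTINCT is needed. The `person` and `address` tables and the `moved_in` column are hypothetical. SQLite (via Python's sqlite3) stands in for MySQL; window functions need MySQL 8.0 or later and SQLite 3.25 or later.

```python
import sqlite3

# SQLite stands in for MySQL; schema and rows are invented for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE address (person_id INTEGER, street TEXT, moved_in TEXT);
INSERT INTO person VALUES (1, 'Ann'), (2, 'Bob');
INSERT INTO address VALUES
  (1, '1 Old Rd', '2019-01-01'),
  (1, '2 New Rd', '2022-06-01'),
  (2, '9 Elm St', '2020-03-15');
""")

query = """
SELECT p.name, a.street
FROM person AS p                      -- the "nugget": one row per person
LEFT JOIN (
    SELECT person_id, street,
           ROW_NUMBER() OVER (
               PARTITION BY person_id ORDER BY moved_in DESC
           ) AS rn                    -- rn = 1 marks the newest address
    FROM address
) AS a ON a.person_id = p.id AND a.rn = 1
ORDER BY p.id
"""
rows = conn.execute(query).fetchall()
print(rows)  # [('Ann', '2 New Rd'), ('Bob', '9 Elm St')]
```

Because the joined subquery contributes at most one row per person, the row count of the result is fixed by the `person` table, which is the property the nugget approach is meant to preserve.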
A plain SELECT on the "Country" column of the "Customers" table returns all values, including duplicates.
Select all the different values from the Country column in the Customers table.
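The difference between the two statements can be sketched as follows, using Python's sqlite3 module as a stand-in for MySQL. The "Customers" table and "Country" column come from the text above; the `CustomerName` column and the sample rows are assumptions made for the example.

```python
import sqlite3

# SQLite stands in for MySQL; sample rows are invented for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Customers (CustomerName TEXT, Country TEXT)")
conn.executemany(
    "INSERT INTO Customers VALUES (?, ?)",
    [
        ("Alfreds", "Germany"),
        ("Ana", "Mexico"),
        ("Antonio", "Mexico"),
        ("Berglunds", "Sweden"),
    ],
)

# Plain SELECT: one output row per table row, duplicates included.
all_countries = [
    row[0]
    for row in conn.execute("SELECT Country FROM Customers ORDER BY Country")
]

# SELECT DISTINCT: each different value appears once.
distinct_countries = [
    row[0]
    for row in conn.execute(
        "SELECT DISTINCT Country FROM Customers ORDER BY Country"
    )
]

print(all_countries)       # ['Germany', 'Mexico', 'Mexico', 'Sweden']
print(distinct_countries)  # ['Germany', 'Mexico', 'Sweden']
```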