|
|
Crystal Reports:
Remove Duplicate Values
One common report layout is a series of
columns across the page to display some important data.
If you want to format the report so
that duplicated values are not repeated down the page, use the field
formatting option.
If it is in the Details section, you
can:
-
Right click on the field, and select
the “Format Field” menu.
-
Tick the Box that says: Suppress if
duplicated
If it is in the Group Header section,
you need to use Conditional Suppress: Click on the X-2 button beside the
Suppress option on the Object Formatting and enter a formula of
{table.field}
= Previous({table.field}).
If it is in the Group Footer section,
sadly this same logic does not work. We will need a formula with variables
to apply this to the group footer.
Now, this brings an important question to mind. Exactly why are
you using a reporting system to display data? A properly
designed report will provide the answers users seek, deriving those answers from
the data. That's really the difference between data and information. And your
report should provide information.
If your report is merely a fancy way of
presenting raw data, or if users want data instead of your report, there is
something wrong with your report.
To fix this problem, ask users what business questions
they need answered. They should not be spending their time manipulating data.
They should have their business questions answered by your report, so they can
do the work they are paid to do.
It may be true, however, that your relational database
-based reporting system can't provide the analysis abilities your users need. If
that is the case, consider implementing an OLAP system.
All the time users spend playing with spreadsheets is not
going to provide answers your reporting system can't provide, as spreadsheets
are a step down from relational databases. Users probably don't realize they are
going backwards, not forwards.
It's also important to ensure the data you do have are accurate. Here some
factors to consider and to work with your database administrator on. That work
often involves educating those who provide the information for the database.
- Accuracy vs. precision. Sometimes, this reaches the
level of absurdity. A classic example is "2.3 people own 2.1 or
more cars." When numbers don't make sense, investigate why and
correct the problem. A Crystal Report that shows 3.6 sales
representatives sold 47.7 or more homes last year is not going to enjoy
a high level of confidence among the end-users.
- Age. Old data can greatly skew the report results.
This is especially true in cases of comparison. For example, Company X
is looking at the performance results of 11 divisions, but only two of
these have this week's data and two others have data over a year old.
Create some mechanism that flags old data for managerial attention.
- Consistency. You cannot draw meaningful trends from
inconsistent data. Consider the U.S. Census Bureau. They will identify a
group of people for study for a defined time. The composition of the
group does not change. Every month, every person in the group gets
interviewed with the same set of questions (these do not change with
each month, either). At the end of the study, the Census Bureau can
honestly say people make X decisions Y percent of the time. If the group
under study had changed, this would not be possible. This concept is a
foundation of statistical analysis.
- Inclusion/Exclusion. What data are included or excluded in the report?
Leaving out crucial information can produce a misleading report, as can
adding too much irrelevant data.
If the data in the report are from multiple tables, are the tables
joined correctly? The absence of some data, or a customer or product
that is new to the database, or is now obsolete can seriously affect
what data appear in the report. Build your report up slowly and as you
add each table to the report, check the record count. Use “Distinct
Count” to see how many different values you get for a primary key
field.
If you are developing a customer report, and there are 421 records in your report, then the distinct count of CustomerID should also be 421. Add the transaction table to the report, and your record count might go up to 1247, and the distinct count may drop to 380. There are now 41 customers missing from the report because they don’t have any transactions. Is that what you want? Continue this process as you design your report, confirming each step along the way.
- Relevance. While not really an accuracy concern,
data of low relevance can have a negative impact on the end results.
Administering and controlling such data uses up resources. Consider the
example of a marketing company that was sending out surveys to shoppers.
These surveys were complex and required about 5 hours per week per
shopper to fill out. People got sick of this, and--just to get through
the survey--entered wrong information. When a new director came in, she
took one look at the survey and said, "This is insane!" She
asked her bosses what it was they most wanted to know, and told them
they had to limit that to what could be obtained in 15 minutes per week
from a shopper. The new survey, sent to the same people, show markedly
different responses to the same data requests. The functional concept
here: "Overload doesn't work."
- Source validation. Are the data in your report consistent with data
from another, trusted source? Have you even checked? Third party
verification can be very helpful. But, don't change just because the
other source is different. Figure out why the differences exist.
- Survey design. This is a science in itself.
However, you need to examine surveys to see if they proved leading
questions, biased multiple choices, allow self-selection, or in some
other way lead to false results. The great Dr. Bill Blanchard, a
statistics guru, said that if you find one survey out of 100 that was
designed properly, you beat the law of averages.
- Variables isolation. The more variables you have in
your data, the less reliable they are.
For example, a racing team wanted
to determine which secondary carburetor jet size was best for a given
temperature and humidity profile. They collected data for an entire
racing season. The next season, they used the results to size their
jets. Their performance went down, not up. Why? Because they had changed
fuel pump pressure, tire sizes, and even the racing fuel. Thus, the jet
size was only one variable. Had they also tracked these other
variables--plus air pressure and wind speed--they may have used
statistical analysis to properly chart the data. However, they tracked
only the jet size and the resulting time. Lacking the proper variables
isolation, reports based on their data led to false conclusions.
This article is copyrighted by
Crystalkeen, Mindconnection, and Chelsea Technologies Ltd. It may be
freely copied and distributed as long as the original copyright is
displayed and no modifications are made to this material. Extracts are
permitted. The names Crystal Reports and Seagate Info are trademarks owned
by Business Objects.
|