Daniel Zvinca

@danz68.bsky.social

A bit of everything, recreational math, mechanical engineer, programmer, statistics, dataviz.

1,217 Followers  |  199 Following  |  117 Posts  |  Joined: 18.11.2023  |  2.5323

Latest posts by danz68.bsky.social on Bluesky

Numbers and related fields are just a hobby for me, so hearing that any of my posts are helpful to people in dataviz is honestly very flattering.

21.06.2025 15:00 - 👍 2    🔁 0    💬 0    📌 0

CSP is always useful for investigating relationships during the exploratory phase of data analysis. The reason it rarely works as an explanatory solution is that we can't exactly say: "This hard-to-read graph shows there's no clear relationship using this method..."

21.06.2025 14:19 - 👍 1    🔁 0    💬 0    📌 0

Ha, glad to hear the method's been helpful. Measuring or even just describing relationships between variables has always been a tricky business.

21.06.2025 14:07 - 👍 1    🔁 0    💬 0    📌 0

Ah, so that's why my posts/remarks get nearly no traction...🤭

21.06.2025 13:58 - 👍 2    🔁 0    💬 1    📌 0
Post image

Are you aware of any data visualization designs that use optical illusions to enhance the intended message?

27.05.2025 10:55 - 👍 7    🔁 2    💬 3    📌 1

While traditional beeswarms aim to reduce empty space through tight packing, my method focuses on frequency accuracy, with the space being used incidentally rather than as a packing objective.

06.05.2025 10:03 - 👍 2    🔁 0    💬 0    📌 0
Post image

Here is the result of my NEW frequency dot plot, an arrangement based on data density calculation using the dot size as resolution (granularity). This result actually validates the basic beeswarm packing for this dataset. Beeswarms are often poor density estimators due to their packing artifacts.

06.05.2025 09:50 - 👍 2    🔁 0    💬 1    📌 0
Point Frequency Histogram: a bias-free, per-point density estimator | Daniel Zvinca The Point Frequency Histogram (PFH) is a simple, and highly effective visual method for accurately estimating the local density of data points. By defining a window around each data point and counting...

The Point Frequency Histogram (PFH) is a novel (??), simple, and highly effective visual method for accurately estimating the local density of data points.

www.linkedin.com/posts/daniel...

13.04.2025 00:44 - 👍 0    🔁 0    💬 0    📌 0
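
A minimal sketch of the per-point counting idea mentioned in the link preview above, assuming a fixed symmetric window of half-width `h` around each point. The function names, the window shape, and the normalization are illustrative assumptions, not the published PFH definition (the preview text is truncated, so the actual details are not available here).

```python
from bisect import bisect_left, bisect_right

def window_counts(values, h):
    """For each value v, count the points inside [v - h, v + h] (v itself included)."""
    ordered = sorted(values)
    counts = []
    for v in values:
        lo = bisect_left(ordered, v - h)   # first index with value >= v - h
        hi = bisect_right(ordered, v + h)  # first index with value > v + h
        counts.append(hi - lo)
    return counts

def window_density(values, h):
    """Assumed normalization: count / (n * window width), giving a rough density."""
    n = len(values)
    return [c / (n * 2 * h) for c in window_counts(values, h)]

data = [1.0, 1.1, 1.2, 3.0, 5.0, 5.05]
print(window_counts(data, 0.15))   # one count per data point
print(window_density(data, 0.15))
```

Each point gets its own estimate, so no binning grid is involved; the only tuning knob in this sketch is `h`.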

It was an exciting experience meeting so many statistical geeks in one place. Meeting @xangregg.bsky.social in person after more than a decade of social media debates was a particular delight. As he said, we could have talked forever.

15.03.2025 13:02 - 👍 1    🔁 0    💬 0    📌 0

Not really a CSP. It is a multi-line/dot chart. Time and ages are just shifted variables, so no chess player's trajectory will have curve twists.

01.03.2025 18:42 - 👍 1    🔁 0    💬 0    📌 0

When density continuity is a fact, my method uses a different idea than KDE.

KDE uses a quite large bandwidth (constant or not) to calculate each value "contribution" to the density shape.

My method finds the smoothest density shape assuming each value was measured within a given tolerance.

20.02.2025 18:02 - 👍 0    🔁 0    💬 0    📌 0
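
For contrast, here is a minimal sketch of a textbook Gaussian KDE with a constant bandwidth, where every value contributes one bump to the density shape. It illustrates only the KDE side of the comparison; it does not implement the tolerance-based method described in the post, and the sample values and bandwidths are made up for illustration.

```python
import math

def gaussian_kde(samples, bandwidth):
    """Return f(x): the average of Gaussian bumps of width `bandwidth`, one per sample."""
    n = len(samples)
    norm = 1.0 / (n * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        return norm * sum(math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples)

    return density

samples = [1.0, 1.2, 1.3, 4.0, 4.1]    # hypothetical data
narrow = gaussian_kde(samples, 0.1)    # small bandwidth: keeps the gap between clusters
wide = gaussian_kde(samples, 1.0)      # large bandwidth: smears mass into the gap
for x in (1.2, 2.6, 4.0):
    print(x, round(narrow(x), 3), round(wide(x), 3))
```

With the larger bandwidth, the estimate assigns noticeable density to the empty region between the two clusters, which is the "contribution" effect the post refers to.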
Post image

My "best" guess is in the first image (truly, just a guess, no idea how good that is). Then I try to fit the dots as well as I consider necessary, starting from the optimal pack (hexagonal) to no errors at all. The final shape resembles my "best" guess, a 0.2x error looking like this:

20.02.2025 16:13 - 👍 0    🔁 0    💬 0    📌 0
Post image

I think that the exploratory phase ends when we are confident enough that we found the density shape (iteratively challenging continuity, gaps, tails). Once we get there, we do our best to fit the available data by minimizing the placement error and/or overlapping, depending on the method.

20.02.2025 16:06 - 👍 1    🔁 0    💬 2    📌 0

Using the word "distribution" in a statistical sense for a few discrete values, such as those from a Likert scale, is a challenge. Can you share the data?

20.02.2025 13:06 - 👍 0    🔁 0    💬 1    📌 0

Any chance you can share the data?

20.02.2025 12:30 - 👍 0    🔁 0    💬 1    📌 0

Not sure how relevant the rate is. No idea how the system works there, but I doubt they can accommodate as many as they want. I would rather encode how many more applicants there are than available places (assuming all are taken), with the following title: More and more people are interested in education.

20.02.2025 12:24 - 👍 1    🔁 0    💬 0    📌 0

it looks like your display has a few bugs

15.02.2025 11:47 - 👍 1    🔁 0    💬 0    📌 0

Can't really follow these graphs, but I am not familiar with the data either. I can't see an explicit functional model to study a relationship from the dots.

09.12.2024 16:44 - 👍 0    🔁 0    💬 0    📌 0

I think this example would be a fantastic exercise for people with different backgrounds, but with a bit of functional math and statistical knowledge. I am not sure how a two-variable relationship dissertation would have to look to be accessible to a broader audience, though.

09.12.2024 10:23 - 👍 0    🔁 0    💬 0    📌 0

As you said, I kept staring at it. I love math and the logic behind PCA, but I don't see how it would improve the practical insights for a general audience. PCA might simplify mathematical analysis by removing the correlation, yet that doesn't mean it also simplifies human sensemaking.

09.12.2024 05:46 - 👍 0    🔁 0    💬 2    📌 0
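
As a small illustration of "removing the correlation", the sketch below rotates centered 2D data onto its principal axes using the closed-form rotation angle for a 2x2 covariance matrix. The data values and function names are hypothetical; this shows the mathematical effect only, not whether the rotated axes are easier for a reader to interpret.

```python
import math

def covariance(xs, ys):
    """Sample covariance (n - 1 denominator)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (n - 1)

def pca_rotate(xs, ys):
    """Center the data and rotate it so the new coordinates are uncorrelated."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    # Angle of the principal axis: tan(2*theta) = 2*cov / (var_x - var_y)
    theta = 0.5 * math.atan2(2 * covariance(xs, ys),
                             covariance(xs, xs) - covariance(ys, ys))
    c, s = math.cos(theta), math.sin(theta)
    pc1 = [c * (x - mx) + s * (y - my) for x, y in zip(xs, ys)]
    pc2 = [-s * (x - mx) + c * (y - my) for x, y in zip(xs, ys)]
    return pc1, pc2

xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]          # hypothetical data
ys = [1.2, 1.9, 3.2, 3.8, 5.1, 6.3]
pc1, pc2 = pca_rotate(xs, ys)
print(covariance(xs, ys), covariance(pc1, pc2))  # second value is ~0
```

The rotated coordinates are mixtures of the original variables, which is exactly why the decorrelated view can be harder to explain to a general audience.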

What I would conclude: 1. The concentric elliptical isolines might be the representation of a bivariate normal distribution. 2. The nearly constant orientation of the ellipses also indicates a linear relationship where values matter the most, in areas with high density.

08.12.2024 19:44 - 👍 0    🔁 0    💬 1    📌 0

It wasn't the case here, because of the orientation, but I didn't know how the isoline shapes look. The ellipses clearly have an angle. The "rotation" spreads the variance across both axes, so rescaling any axis will not convert the ellipse into a circle.

08.12.2024 19:34 - 👍 1    🔁 0    💬 1    📌 0
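
A quick numeric check of this point, as a sketch under the assumption that the cloud is summarized by a 2x2 covariance matrix. Multiplying x by `kx` and y by `ky` scales the covariance term by `kx * ky`, so it can never be driven to zero; after rescaling both axes to unit variance the matrix is [[1, ρ], [ρ, 1]] with eigenvalues 1 ± ρ, which describes a circle only when ρ = 0. The numbers below are made up for illustration.

```python
import math

def axis_ratio(var_x, var_y, cov_xy):
    """Major/minor axis ratio of the covariance ellipse (assumes |correlation| < 1)."""
    mean = (var_x + var_y) / 2.0
    diff = math.hypot((var_x - var_y) / 2.0, cov_xy)
    # Eigenvalues of a symmetric 2x2 matrix are mean + diff and mean - diff.
    return math.sqrt((mean + diff) / (mean - diff))

def rescale(var_x, var_y, cov_xy, kx, ky):
    """Covariance terms after multiplying x by kx and y by ky."""
    return var_x * kx ** 2, var_y * ky ** 2, cov_xy * kx * ky

# Axis-aligned ellipse (no covariance): rescaling x to unit variance gives a circle.
aligned = rescale(4.0, 1.0, 0.0, 0.5, 1.0)
# Tilted ellipse (correlation 0.8): the same rescaling still leaves an ellipse.
tilted = rescale(4.0, 1.0, 1.6, 0.5, 1.0)
print(axis_ratio(*aligned), axis_ratio(*tilted))
```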
Post image

In a scatterplot, the visual shape of the distribution (ellipse or circle) is influenced by the scaling of the axes. Adjusting the scales can transform an ellipse oriented at 0° or 90° into a circle. This means the elongation might be a plotting artifact rather than a true feature of the data.

08.12.2024 19:18 - 👍 2    🔁 0    💬 1    📌 0

I would love to see some isodensity or probability contours on this dataset with adjusted scales based on that. I would not filter out anything, I would just rescale the axes based on the ratio of, say, 90% probability ranges.

08.12.2024 11:58 - 👍 1    🔁 0    💬 1    📌 0
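
One possible reading of this suggestion, sketched below: take the central 90% range of each variable (5th to 95th percentile) and use the ratio of the two ranges as the number of y units that should occupy the same screen length as one x unit. Interpreting "90% probability ranges" as empirical percentile intervals is an assumption, as are the function names and the linear-interpolation quantile.

```python
def quantile(values, q):
    """Linearly interpolated empirical quantile, q in [0, 1]."""
    ordered = sorted(values)
    pos = q * (len(ordered) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (pos - lo) * (ordered[hi] - ordered[lo])

def central_range(values, coverage=0.90):
    """Width of the central interval holding `coverage` of the values."""
    tail = (1.0 - coverage) / 2.0
    return quantile(values, 1.0 - tail) - quantile(values, tail)

def y_units_per_x_unit(xs, ys, coverage=0.90):
    """How many y data units should span the same screen length as one x unit."""
    return central_range(ys, coverage) / central_range(xs, coverage)

xs = [float(i) for i in range(101)]   # hypothetical data
ys = [10.0 * x for x in xs]
print(y_units_per_x_unit(xs, ys))     # about 10 for this example
```

No points are filtered out here; the percentiles only set the scale, so outliers stay on the plot without stretching it.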

I can't really see a functional relationship (not really predictable).

However, the visible ellipse rather shows (kind of) a statistical dependency, but only if the range-calculated scales don't fool us (if it were a circle, there would be no dependency).

08.12.2024 11:38 - 👍 1    🔁 0    💬 1    📌 0

Irrelevant reply:

"I am not sure if all need to be data encoding projects.

If not: recreate a logo. Describe how much fun you had getting there.

Else: recreate a masterpiece. Don't bother admiring too much the original. Convince your audience that your perspective is valid."

23.11.2024 20:23 - 👍 1    🔁 0    💬 1    📌 0

... an obvious insensitivity to local variations (a reduction in health investments two years in a row didn't automatically show up in life expectancy, neither in those years nor 10 years later). When such a lag and such obviously slow data responsiveness occur, CSP is hardly a solution.

🤷‍♂️

21.11.2024 23:27 - 👍 0    🔁 0    💬 0    📌 0
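
The lag issue can be illustrated with a lagged correlation, sketched below on synthetic data where the "effect" is simply the "cause" delayed by three steps. The series, the lag of three, and the function names are all made up for illustration; nothing here uses the actual life expectancy or health expenditure data.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def lagged_correlation(cause, effect, lag):
    """Correlate cause[t] with effect[t + lag], for lag >= 0."""
    if lag == 0:
        return pearson(cause, effect)
    return pearson(cause[:-lag], effect[lag:])

# Synthetic example: the effect repeats the cause three steps later.
cause = [math.sin(0.7 * t) + 0.3 * math.sin(2.1 * t) for t in range(40)]
effect = [0.0] * 3 + cause[:-3]

best = max(range(8), key=lambda k: lagged_correlation(cause, effect, k))
print(best, round(lagged_correlation(cause, effect, 0), 3))
```

A connected scatterplot pairs values from the same time step, which corresponds to lag 0 here; the relationship only becomes clear once the series are aligned at the right lag.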

Thing is that CSP often has a serious drawback inherited from SP, but rarely mentioned.

In the famous CSP showing Life Expectancy vs Health Expenditure a "little" detail was left out.

That was the obvious cause-effect lag. For this dataset we can easily guess not only a decade gap, but also ->

21.11.2024 23:19 - 👍 0    🔁 0    💬 1    📌 0

To be honest, this sort of debate resembles survivorship bias quite well: focusing on or counting what is well known and ignoring the rest just because it didn't occur too often in practice.

Everything I wrote about CSP strengths is legit. It just happens that they are not too often the reasons for a CSP design.

21.11.2024 23:00 - 👍 0    🔁 0    💬 1    📌 0

It can be, but how relevant is this? Sure, counting preferences matters, and more platforms is even nicer. Yet this doesn't put a value on CSP, but on those poorly designed.

I hardly "like" any dataviz technique. I prefer to choose them well when appropriate.

21.11.2024 22:45 - 👍 0    🔁 0    💬 1    📌 0
