-
Unicode string length can mean different things in different languages
I was working on a text processing example across several different programming languages, including C++, Java, Rust, and Scala, and noticed some discrepancies in the results.
It turned out that these are due to Unicode string length meaning different things in different languages:
In Java, Scala, etc., the length() method returns the number of UTF-16 code units, which for many strings matches the number of abstract, high-level characters (glyphs) from a human reader's point of view. By contrast, in C++, Go, and Rust, the equivalent functions and methods return a result based on the number of bytes required to store those characters.
jshell> "résumé".length()
$1 ==> 6
❯ evcxr
Welcome to evcxr. For help, type :help
>> "résumé".len()
8
>> "résumé".chars().count()
6
len([]rune("résumé")) // returns 6
Apparently it's a bit more complicated in C++.
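To make the different notions of length concrete on the Java side, here is a minimal sketch that reports three counts for the same string: UTF-16 code units (what length() returns), Unicode code points, and UTF-8 bytes. The class and method names are hypothetical and only serve this illustration; the strings use Unicode escapes so the result does not depend on the source file encoding.

```java
import java.nio.charset.StandardCharsets;

class UnicodeLengths {
    // UTF-16 code units: this is what String.length() reports.
    static int utf16Units(String s) {
        return s.length();
    }

    // Unicode code points, comparable to chars().count() in Rust.
    static int codePoints(String s) {
        return s.codePointCount(0, s.length());
    }

    // Bytes in the UTF-8 encoding, comparable to len() in Rust.
    static int utf8Bytes(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    static void report(String label, String s) {
        System.out.println(label + ": units=" + utf16Units(s)
            + " codePoints=" + codePoints(s) + " utf8Bytes=" + utf8Bytes(s));
    }

    public static void main(String[] args) {
        report("precomposed", "r\u00e9sum\u00e9");   // units=6 codePoints=6 utf8Bytes=8
        report("decomposed", "re\u0301sume\u0301");  // units=8 codePoints=8 utf8Bytes=10
        report("emoji", "\uD83D\uDE00");             // units=2 codePoints=1 utf8Bytes=4
    }
}
```

The decomposed and emoji cases show that none of these three counts reliably matches what a reader perceives as one character.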
-
Running the Editcp DMR codeplug editor binary on a Mac
I've been looking for a painless way to run Dale Farnworth's excellent Editcp DMR codeplug editor on my Mac, mostly because it's more convenient than additionally pulling out my rarely used Linux laptop.
I vaguely recall trying a few years ago to build it natively on the Mac; the process was very complex and ultimately failed. So today I tried a different approach: running Editcp in one of my existing VMware Ubuntu Linux VMs.
In short, I'm delighted to report that this works for me once I installed XQuartz on the Mac and libqt5gui5 on Linux!
My environment:
- macOS Sonoma
- VMware Fusion
- Ubuntu 22.04 LTS 64-bit VM
Steps for installing and running:
- brew install xquartz
- ssh -X ubuntuvm.local
- sudo apt install libqt5gui5
- /opt/bin/editcp
- choose to connect the USB device to Linux
-
JaCoCo doesn't directly support vacuously true 100% branch coverage
Background
I'm an educator and share most of my thoughts just with my students. Once in a while, I have something to share that might help a wider audience, and I have decided to try that here.
Measuring code coverage with JaCoCo
Code coverage is a metric that indicates how thoroughly we're testing. JaCoCo is a mature, actively developed code coverage tool for Java and other JVM-based languages. For each type of coverage, such as lines, branches, etc., it keeps track of covered and missed items and generates a report with the corresponding coverage percentages.
Minimal example in Java
Here is a minimal SUT (system under test) in Java:
public class HelloWorld {
    public String getMessage() {
        return "hello world";
    }
}
Here is a JUnit assertion for it:
assertEquals("hello world", new HelloWorld().getMessage());
And this is the resulting coverage report:
[info] Test run started (JUnit Jupiter)
[info] Test hw.TestHelloWorld#getMessage() started
[info] Test run finished: 0 failed, 0 ignored, 1 total, 0.136s
[info] Passed: Total 1, Failed 0, Errors 0, Passed 1
[info]
[info] ------- Jacoco Coverage Report -------
[info]
[info] Lines: 100% (>= required 90.0%) covered, 0 of 2 missed, OK
[info] Instructions: 100% (>= required 80.0%) covered, 0 of 5 missed, OK
[info] Branches: 0% (< required 100.0%) covered, 0 of 0 missed, NOK
[info] Methods: 100% (>= required 100.0%) covered, 0 of 2 missed, OK
[info] Complexity: 100% (>= required 100.0%) covered, 0 of 2 missed, OK
[info] Class: 100% (>= required 100.0%) covered, 0 of 1 missed, OK
[info]
[info] Check /Users/laufer/Work/teaching/cs335/hello-java-sbt/target/scala-2.12/jacoco/report for detailed report
[info]
[error] java.lang.RuntimeException: Required coverage is not met
...
According to JaCoCo, the required coverage threshold is not met because zero branches were covered.
How does JaCoCo calculate this?
In JaCoCo's CounterImpl class, the coverage percentages are calculated as follows:
public double getCoveredRatio() {
    return (double) covered / (missed + covered);
}
So, when there is nothing to be covered or missed, the ratio is Double.NaN ("not a number"), representing the division of 0 by 0 in this case.
Discussion
If there are no branches to be covered, shouldn't coverage be automatically (vacuously) 100%? From a discrete math perspective, yes:
If there is no work to be done, then the work is already 100% done.
This corresponds to the understanding in mathematical logic that a universally quantified predicate over an empty set (of any element type) is always true (even if the predicate itself is always false):
scala> Set.empty[String].forall(x => false)
val res0: Boolean = true
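The same vacuous truth can be observed in Java itself. The following is a minimal sketch using the standard stream API; the class and method names are hypothetical. It also shows the dual case: an existential check over an empty set is always false.

```java
import java.util.Collections;
import java.util.Set;

class VacuousTruth {
    // "Every item satisfies an always-false predicate": true only for the empty set.
    static boolean allSatisfyFalse(Set<String> items) {
        return items.stream().allMatch(x -> false);
    }

    // "Some item satisfies an always-true predicate": false only for the empty set.
    static boolean anySatisfiesTrue(Set<String> items) {
        return items.stream().anyMatch(x -> true);
    }

    public static void main(String[] args) {
        Set<String> empty = Collections.emptySet();
        Set<String> one = Collections.singleton("a");
        System.out.println(allSatisfyFalse(empty));  // true
        System.out.println(allSatisfyFalse(one));    // false
        System.out.println(anySatisfiesTrue(empty)); // false
        System.out.println(anySatisfiesTrue(one));   // true
    }
}
```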
It's easy to implement this behavior by adding a case distinction:
public double getCoveredRatio() {
    if (missed == 0) {
        return 1;
    }
    return (double) covered / (missed + covered);
}
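To compare the two calculations in isolation, here is a small standalone sketch. It is not JaCoCo code: the class CoverageRatio and its static methods are hypothetical stand-ins that take the covered and missed counts as parameters instead of reading them from fields.

```java
class CoverageRatio {
    // Plain ratio, following the formula quoted earlier: 0 / 0 yields NaN.
    static double ratio(int covered, int missed) {
        return (double) covered / (missed + covered);
    }

    // Variant with the case distinction: nothing missed counts as fully covered.
    static double vacuousRatio(int covered, int missed) {
        if (missed == 0) {
            return 1;
        }
        return (double) covered / (missed + covered);
    }

    public static void main(String[] args) {
        System.out.println(ratio(0, 0));        // NaN
        System.out.println(vacuousRatio(0, 0)); // 1.0
        System.out.println(ratio(3, 1));        // 0.75
        System.out.println(vacuousRatio(3, 1)); // 0.75
        // Any ordered comparison involving NaN evaluates to false.
        System.out.println(ratio(0, 0) >= 1.0); // false
    }
}
```

The two variants agree whenever there is at least one item to cover; they differ only in the empty case.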
After working on this for a bit, I finally found the pertinent closed issue from 2015. The closing comment suggests that they are favoring a continuous math perspective based on the ratio calculation shown above, which returns NaN when there are no items at all, even though none are missed. They closed comments on this issue, and I don't think they'd consider reopening it.
Arguably, this is a leaky abstraction that is inconsistent with the definition of code coverage from a discrete math perspective.
Why is this even relevant?
Doesn't all real-world code include branches, such as conditionals or switches?
Well, not necessarily. Arguably, the fewer the better from a readability and maintenance point of view. Event-based systems, such as this Android stopwatch, might not need explicit branching in their control flow when using structural techniques instead, such as attaching listeners to different event sources and applying the State pattern to implement the underlying state-based behavior.
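As a rough illustration of how state-based behavior can avoid explicit branching, here is a minimal sketch of the State pattern for a start/stop button. It is not taken from the Android stopwatch mentioned above; the class, enum, and method names are hypothetical. Each state decides its own successor, so the source contains no if or switch statements.

```java
class BranchFreeStopwatch {
    // Each state knows which state follows a press of the start/stop button.
    enum State {
        STOPPED {
            @Override
            State onStartStop() {
                return RUNNING;
            }
        },
        RUNNING {
            @Override
            State onStartStop() {
                return STOPPED;
            }
        };

        abstract State onStartStop();
    }

    private State state = State.STOPPED;

    // The event handler delegates to the current state instead of branching.
    void onStartStop() {
        state = state.onStartStop();
    }

    State getState() {
        return state;
    }

    public static void main(String[] args) {
        BranchFreeStopwatch stopwatch = new BranchFreeStopwatch();
        System.out.println(stopwatch.getState()); // STOPPED
        stopwatch.onStartStop();
        System.out.println(stopwatch.getState()); // RUNNING
        stopwatch.onStartStop();
        System.out.println(stopwatch.getState()); // STOPPED
    }
}
```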
Conclusion
This is obviously a rather minor issue, but it regularly throws off my students. The path of least resistance seems to be changing the JaCoCo branch coverage threshold in your build configuration: if your SUT happens to have no branches, set the branch coverage threshold to 0; otherwise, set it to the desired percentage.
jacocoReportSettings := JacocoReportSettings()
  .withThresholds(
    JacocoThresholds(
      ...
      branch = 0,
      ...
-
Konstantin Läufer on LinkedIn: Computer Science, Assistant Professor (Algorithms, Languages, Formal…
#LoyolaChicago #ComputerScience invites applications for a full-time, tenure-track Assistant Professor position beginning Fall 2024 in the areas of…