One of my current open source projects uses several crypto libraries.
One of them is OpenSSL. A quirk of OpenSSL's normal encrypt/decrypt API call is the options field, which basically just defines how the output should be returned.
While writing the documentation for all the properties/fields, I realized that instead of writing
valid_options = [OPTION_1, OPTION_2, OPTION_3, OPTION_1 | OPTION_2, ....];
if (!valid_options.contains(passedOption)) {
    complain|return
}
I could just use
option_lower_boundary = 0
option_upper_boundary = OPTION_1 | OPTION_2 | OPTION_3
if (!passedOption.between(option_lower_boundary, option_upper_boundary)) {
    complain|return
}
because the options are combined with bitwise OR and the individual flags have the values 1, 2, 4, ... (with 0 meaning no option set).
Every valid combination therefore lies between 0 and 7, so instead of writing out all possible combinations I can treat anything within this range as valid input.
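To make the comparison concrete, here is a minimal sketch of both checks. The flag values 1, 2 and 4 and all function names are assumptions for illustration, not the library's real constants. It also shows a caveat: the range check only matches the explicit list while the flags occupy consecutive bits.

```javascript
// Hypothetical flag values; the post only says they are 1, 2, 4.
const OPTION_1 = 1;
const OPTION_2 = 2;
const OPTION_3 = 4;

const LOWER = 0;
const UPPER = OPTION_1 | OPTION_2 | OPTION_3; // 7

// Range check: valid if the value is an integer within [LOWER, UPPER].
function isValidByRange(option) {
  return Number.isInteger(option) && option >= LOWER && option <= UPPER;
}

// Builds every OR-combination of the given flags, including 0 (no flag set).
function allCombinations(flags) {
  let combos = [0];
  for (const flag of flags) {
    combos = combos.concat(combos.map((c) => c | flag));
  }
  return combos;
}

// Explicit check: valid if the value is one of the enumerated combinations.
const explicit = new Set(allCombinations([OPTION_1, OPTION_2, OPTION_3]));
function isValidByList(option) {
  return explicit.has(option);
}

// Caveat: with a gap in the flags (only 1 and 4), the valid combinations
// are 0, 1, 4 and 5, so a plain range check of 0..5 would wrongly accept 2 and 3.
const gapped = new Set(allCombinations([1, 4]));
```

For the three consecutive flags both functions agree on every input, so the range check is a fair shortcut there. If a flag is ever skipped or reserved, the range no longer describes the valid set and a mask or explicit list is the safer choice.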
I obviously think this is way more elegant ... but that's just my perspective.
So what do you think? Should I be explicit or is using the mathematical concept of ranges better?
To some degree it's the "every pure function can be replaced by a lookup table" principle: the trade-off between "this can be between 0 and 7" and "is it 0, 1, 2, 3, 4, 5, 6 or 7", which is basically the difference between boundary checks and explicit value checks.
But the classic problem is implicit bias ... just because it makes sense to me, should I leave it like this? I wrote the principle in the documentation and linked the check in the code base to the example of how it works. Which is enough for me ... but is it?
Another question is about the documentation. I personally tend to write example code in my documentation because I want to write for beginners.
So I have docblocks above properties containing longer documentation + examples + links to different sources which explain the terms used. Do you think that's a good idea?
Why or why not? :)
thx :) for any feedback
Is just using a parameter for each option a possibility? Or using a configuration object? Probably openssl does not work that way, but I'd consider using that internally and packing all the bits into openssl format at the last step.
Then the check is nicely encapsulated, and if you give it a meaningful name so that it's immediately clear what it does, then I feel you can choose the brief, performant option. Asymptotically that's what you'll have to do anyway: you don't want 1024 conditions if there are 10 options, unless many combinations are disallowed.
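A minimal sketch of that idea, assuming three boolean options: the names `opt1` to `opt3`, the bit values and `packOptions` are hypothetical and only stand in for whatever the real library expects.

```javascript
// Hypothetical mapping from readable option names to bit values.
const FLAG_BITS = Object.freeze({ opt1: 1, opt2: 2, opt3: 4 });

// Validates a configuration object and packs it into one integer
// at the last step before calling the underlying library.
function packOptions(options = {}) {
  let packed = 0;
  for (const [name, enabled] of Object.entries(options)) {
    if (!Object.prototype.hasOwnProperty.call(FLAG_BITS, name)) {
      throw new RangeError(`Unknown option: ${name}`);
    }
    if (typeof enabled !== "boolean") {
      throw new TypeError(`Option ${name} must be a boolean`);
    }
    if (enabled) {
      packed |= FLAG_BITS[name];
    }
  }
  return packed;
}
```

With this shape the caller never sees raw integers, so an invalid combination can only come from an unknown name or a non-boolean value, and both are rejected with a readable message.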
What you're talking about here are basically "flags" -- and bitwise operations are ALWAYS fastest for flags since you can do comparisons in one operation instead of many.
You want to know if a 32-bit number is 0..7? You just do "flags & 0xFFFFFFF8", and if the result is non-zero, it's out of range. It's just like checking for a single value: let's say you want your third flag (0x04), you just do "flags & 0x00000004", and if it's non-zero it's set.
That's why you "or" them together. Admittedly you're limited to whatever your largest integer bit-size is for a variable, but overall it is the fastest approach 'under the hood'.
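The two mask tests described above could look like this in JavaScript. The constant and function names are assumptions for illustration. Note that JavaScript's bitwise operators first convert their operands to 32-bit integers, so a non-integer input would need a separate check.

```javascript
// Hypothetical values: three flags in the lowest three bits.
const OPTION_3 = 0x04;
const VALID_MASK = 0x07;

// True if any bit outside VALID_MASK is set, i.e. the value is not in 0..7.
// ~VALID_MASK has the same 32-bit pattern as 0xFFFFFFF8.
function hasUnknownBits(flags) {
  return (flags & ~VALID_MASK) !== 0;
}

// True if the given single flag is set in flags.
function isSet(flags, flag) {
  return (flags & flag) !== 0;
}
```

For example, `hasUnknownBits(5)` is false and `hasUnknownBits(8)` is true, while `isSet(5, OPTION_3)` is true because 5 is 0b101.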
Comparing to an array is SLOW. MORE so in interpreted languages with loose typecasting. Whilst in C it ends up relatively simple and fast since you just shift a couple times to make the offset match the field size, you get into JavaScript and you don't have real arrays, you have painfully slow pointered lists.
So even ranges are often better in that regard, since a single range check is ALWAYS just two comparisons, whilst iterating through a flat array can take anywhere from one comparison to as many as there are elements in the array. You have 200 array elements, that's possibly as many as 200 compares.
... which is where a flattened binary tree starts to shine, since then your number of comparisons is the same as the number of bits needed to store the number of elements. You have 255 elements to compare against, a flattened binary tree would never take more than eight compares.
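One way to read "flattened binary tree" is a binary search over a sorted array, which is what this sketch uses. The function name and the sample data (255 even numbers) are made up for illustration. It counts the probes so the logarithmic bound can be observed rather than taken on faith.

```javascript
// Binary search over a sorted array of numbers.
// Returns whether target was found and how many elements were probed.
function binarySearch(sortedValues, target) {
  let low = 0;
  let high = sortedValues.length - 1;
  let probes = 0;
  while (low <= high) {
    const mid = (low + high) >>> 1; // midpoint of the current window
    probes += 1;
    if (sortedValues[mid] === target) {
      return { found: true, probes };
    }
    if (sortedValues[mid] < target) {
      low = mid + 1; // continue in the upper half
    } else {
      high = mid - 1; // continue in the lower half
    }
  }
  return { found: false, probes };
}

// Hypothetical data: 255 sorted values (0, 2, 4, ..., 508).
const values = Array.from({ length: 255 }, (_, i) => i * 2);
```

With 255 values the search window shrinks 255, 127, 63, 31, 15, 7, 3, 1, so no lookup probes more than eight elements, whether or not the target is present.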
It's really hard to say more though, since I've no idea of your usage scenario, WHAT options you are setting, etc, etc. Are these actually flags, or are they values? If they're flags, and you have 32 or fewer of them, store them as a 32-bit integer and use "and" / "or" to work on them.
... and I like turtles.
Well, I think you are hitting a caught-between-worlds problem. In low-level languages, like C, one often uses bit-fields in order to communicate options in a lean way. One integer usually is the most performant variable type, and using bitfields it can hold quite a lot of (boolean) information. Passing it around and copying it is fast and it can easily live on the stack. For low-level devs, bit-fields are a common appearance and people know how to handle them.
Since I cannot argue for PHP, because I have not used it for a long time, let me discuss JS at least. JavaScript is, by all means, not low-level. Everything lives on the heap and it is good practice to do (slow) string comparisons in order to have very verbose source code, which is easy on the eyes of developers and designers alike. It is a web language, made for animation and visual improvement, not performance.
Hence I think, even though in OpenSSL, a C-based library, bit-fields are a common thing, in JavaScript, using objects which hold the options in a verbose way is more natural. I'd actually argue that in JS, you don't get a big performance or memory benefit from bit-fields, and they might feel alien to other JS devs.
That's why
/**
 * Holds options
 * @type {Object.<string, boolean>}
 */
const options = {
  opt1: true,
  opt2: false,
  opt3: true,
};

would be my preferred way to go. Try to get performance from algorithms in JS, not from memory layouts. JS engines usually are optimized to handle objects anyway. If you really need that kind of optimization, go for WebAssembly or (in case of NodeJS) FFI bindings.