aqp
aqp copied to clipboard
aqp 2.0 function/method/argument naming conventions
The laissez-faire approach to the growth of aqp over the last decade has allowed a proliferation of a variety of different naming conventions. I believe this has overall been good for the development of the package -- as the focus has been on delivering tools to the users -- and perhaps not getting too caught up in formalities.
However, I believe all developers, and probably all users, of aqp have to some degree become frustrated with what they may percieve as "inconsistencies."
I recognize inconsistencies but also recognize that actual "standard" up to this point is an illusion -- at least with respect to aqp.
This issue is for discussion of suggestions to improvements to the aqp function/method and argument naming scheme.
I feel like I could suggest as a couple goals to constrain and guide our work. Feel free to add or modify these.
-
relatively "uniform" syntax that reflects a cohesive, well-thought out package
-
minimal number of keystrokes, while still providing essential context as to what a function does
-
function names should be verbs -- they "do" something
-
arguments that refer to the same data for similar functions should do similar things, be ordered and named the same -- there are a couple really nasty inconsistencies like
p
for profile orp
forpattern
I personally am one of the worst offenders in terms of the variety of functions I have introduced and the variety of naming conventions (I use dots, snake case, camel case, every verb, no verb, etc).
Even though many of my functions have been in place a while, I recognize that people are only just now getting to "know" them -- and now have suggestions. I have always been open to renaming them -- but have generally stated that I would not do so until we decide what exactly our naming "target" is.
So that is what this issue is for.
See this file for some investigations I did into the functions used in aqp. https://github.com/ncss-tech/aqp/blob/aqpdf/misc/aqp2/function-dependencies.R
The below groups are general groups -- assigned using a sequence of regular expressions. It isn't perfect, but it is perhaps a starting point.
$plot
[1] "addBracket, addDiagnosticBracket, addVolumeFraction, colorContrastPlot, explainPlotSPC, findOverlap, fixOverlap, groupedProfilePlot, make.segments, plot, plot_distance_graph, plotColorQuantiles, plotMultipleSPC, plotSPC"
$color
[1] "aggregateColor, barron.torrent.redness.LAB, buntley.westin.index, colorContrast, colorQuantiles, contrastChart, contrastClass, getClosestMunsellChip, harden.melanization, harden.rubification, hasDarkColors, horizonColorIndices, huePosition, hurst.redness, munsell2rgb, parseMunsell, previewColors, rgb2munsell, soilColorSignature, soilPalette, thompson.bell.darkness"
$texture
[1] "hztexclname, hztexclname<-, texture.triangle.low.rv.high, textureTriangleSummary"
$spccalc
[1] "argillic.clay.increase.depth, crit.clay.argillic, estimatePSCS, estimateSoilDepth, get.increase.depths, get.increase.matrix, get.ml.hz, get.slice, getArgillicBounds, getCambicBounds, getMineralSoilSurfaceDepth, getPlowLayerDepth, getSoilDepthClass, getSurfaceHorizonDepth, mollic.thickness.requirement, sim"
$spc
[1] "aqp.env, aqp_df_class, checkHzDepthLogic, checkSPC, compositeSPC, coordinates, coordinates<-, denormalize, depth_units<-, depthOf, depths<-, diagnostic_hz, diagnostic_hz<-, filter, generalize.hz, genhzTableToAdjMat, glom, glomApply, grepSPC, guessGenHzLevels, guessHzAttrName, guessHzDesgnName, guessHzTexClName, horizonDepths, horizonDepths<-, horizonNames, horizonNames<-, horizons, horizons<-, hzDepthTests, hzDesgn, hzdesgnname, hzdesgnname<-, hzDistinctnessCodeToOffset, hzID, hzID<-, hzidname, hzidname<-, hzTransitionProbabilities, idname, lunique, maxDepthOf, metadata, metadata<-, minDepthOf, missingDataGrid, mostLikelyHzSequence, mutate, mutate_profile, nrow, pc.SPC, permute_profile, profile_compare, profile_id, profile_id<-, profileApply, profileGroupLabels, proj4string, proj4string<-, random_profile, rbind.SoilProfileCollection, rebuildSPC, reorderHorizons, replaceHorizons<-, restrictions, restrictions<-, site, site<-, siteNames, siteNames<-, SoilProfileCollection, spc_in_sync, split, subApply, subsetProfiles, test_hz_logic, union, unique, unroll, validSpatialData"
$spcagg
[1] "aggregateSoilDepth, evalGenHZ, evalMissingData, genSlabLabels, group_by, pc, slab, slice, slice.fast, summarize"
$xrd
[1] "f.noise, resample.twotheta"
$classification
[1] "brierScore, confusionIndex, shannonEntropy, summaryTauW, tauW, xtableTauW"
Thanks for putting all of this together. aqp
and related packages are now sufficiently complex and widely-used that we need some consistency. Function naming is the tip of the iceberg. My thoughts on the matter:
- Suggest optimizing for intuition ("estimate a thing") and accurate code-completion (
estimate...
) vs. pure style adherance. - I prefer
camelCase
for high level functions, and care less about low level functions. - The
.
or_
are handy for complex (usually low level) functions that aren't easily described by 1-2 words. - Too many
_
are annoying to type. - mixing style is fine by me as long as there is a reason (
NASISSomething
is confusing,NASIS_someThing
less so) - Overly-simplistic, cutesy function names are annoying and counter-intuitive.
- Generic functions like
union
andfilter
are usually a good choice, but not always. For example, I do not think that any function performing aggregation should be re-named toaggregate
. - Some functions need simpler names (
texture.triangle.low.rv.high
is terrible name). - Some functions need to be removed in favor of modern replacements (
unroll
->slice
)
Since aqp
is about 50% data manipulation and 50% analysis, I think that we should could use some informative prefixes:
-
get
: "extract from" or "retrieve" (more specificity likely required as it means "fetch" insoilDB
) -
check
: validity -
estimate
: much of the analysis is really estimation -
plot
: plot a thing -
add
: annotate a plot of sketches
These are useful guidelines (hey a good use for the wiki) but should not be absolute. We don't have time to play style police. Lets all try and be constructive when making suggestions about others' code.