Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Composable scalar transforms #128

Open
wants to merge 41 commits into
base: master
Choose a base branch
from
Open

Conversation

Ickaser
Copy link

@Ickaser Ickaser commented Mar 6, 2025

This is very much still WIP, but took a stab at constructing something like discussed in #95 and #126 . As of yet, I haven't even tested that this runs, much less correctly; will come back when I have more time.

One major question: for things like mapping to negative reals, I am imagining Scale(-1) ∘ Exp(). In this case, should the log-Jacobian of Scale be log(abs(scale))?

@Ickaser
Copy link
Author

Ickaser commented Mar 14, 2025

I've added some scalar consistency tests now, which pass, with this new API. If there are some other obvious tests, I am open to suggestions. Still missing is any treatment of non-Real number types, as would be necessary for Unitful, but if you want to merge this and get feedback first before tackling the Unitful-friendly part this might be ready for something like that.

New questions

  • Namespace: to avoid name collisions, I decided to rename Exp, Scale, and Shift to TVExp, TVScale, and TVShift. That way they can be exported and maintain a shorter name than TransformVariables.Exp. I'm of course open to other prefixes or ways to keep it readable.
  • Should unit transforming use the Scale operator (so allow Scale to take non-Real types), or give it its own separate transformation (which I guess could accommodate other kinds of exotic number types)?

@Ickaser
Copy link
Author

Ickaser commented Mar 14, 2025

Some other ideas I had but which haven't tried out yet (note to self as much as anything else)

  • We could try replacing the existing ShiftedExp and ScaledShiftedLogistic with these elemental transforms, but there may be some error/sanity checks we miss out on because a CompositeScalarTransform doesn't yet have end-to-end error checking like these manually-written ones do.
  • I haven't yet tried composing prior transformations with the new ones. If I'm lucky it will just work
  • It may make sense to introduce a TVNeg() transformation for a simple sign change, rather than using e.g. TVScale(-1) which encodes a number type in the -1.

@Ickaser
Copy link
Author

Ickaser commented Mar 20, 2025

I decided to add a TVNeg transformation, which if I am understanding correctly has a log-Jacobian of 0, and enforce that TVScale takes only positive numbers. Some tests are also in place and pass on my machine. I haven't gone so far as to replace existing transforms with the refactored form, which would put the new version through the full test suite.

I haven't thought deeply about what this current implementation will do with real number types, e.g. Float32, either in parameters or inputs.

As far as using this to introduce Unitful support, I see two main options:

  1. Create a Unitful extension with a single scalar transform, e.g. TVUnit(u::Unitful.Units). This transform would construct a Quantity in the forward direction, and apply ustrip for the inverse transform. This approach would have to be custom-implemented for each exotic number types (e.g. units provided by DynamicQuantities), but I think there would be a natural way to do so for any given type.
  2. Generalize the TVScale transform to allow any object; in this case, you multiply by units in the forward direction, then the inverse divides by the units. In the case of Unitful, this should produce the right types on the inverse. For other number types, multiplying should yield new types with the help of promotion, but the inverse would not be so simple.

In case 1, new number types each need their own extension. In case 2, inverse transforms will not yield original types.

@tpapp
Copy link
Owner

tpapp commented Mar 20, 2025

I think this is shaping up nicely, thanks for the work! Sorry I have been very busy with a project and failed to provide any feedback so far, I will try to make up for it now.

I am imagining Scale(-1) ∘ Exp(). In this case, should the log-Jacobian of Scale be log(abs(scale))?

Chaining just multiplies the Jacobians, so abs(log(...)) Jacobians are just added up. It should be something like log(abs(scale)) + x, I think your implementation is correct.

In case 2, inverse transforms will not yield original types.

I think that is OK, I think we should just clarify the API for inverse. In particular, the best the user can expect is a vector of equal length, approximately equal elementwise. When this PR is done I will make that clear.

@Ickaser
Copy link
Author

Ickaser commented Mar 25, 2025

I went ahead and tried out replacing the ShiftedExp and ScaledShiftedLogistic with the corresponding refactored transformations, and by and large the refactor works with the existing test suite. Exceptions that I marked with @test_broken:

  • as(Real, 1, 4.0) == as(Real, 1.0, 4.0)
    • The new transforms each hold their own number types (e.g. TVShift{Int64} vs TVShift{Float64}. Since my goal is to eventually accommodate new (unitful) number types, I don't know whether to continue to enforce this. I can define something like ==(TVShift(a), TVShift(b)) = (a==b) and run with that
  • Scalar show tests.
    • This is a valid question: I decided to show a CompositeScalarTransform by printing a valid expression (TVShift(5) ∘ TVExp()); would you prefer to represent it differently? Should there be special dispatches for e.g. asℝ₋, which is TVNeg() ∘ TVExp(), like there were for ShiftedExp and ShiftedScaledLogistic? In particular, if we have a special show for asℝ₊ == TVExp(), then we get results like TVExp() ∘ TVScale(2) printed like asℝ₊ ∘ TVScale(2), which I think I don't like.

@Ickaser
Copy link
Author

Ickaser commented Mar 25, 2025

In case 2, inverse transforms will not yield original types.

I think that is OK, I think we should just clarify the API for inverse. In particular, the best the user can expect is a vector of equal length, approximately equal elementwise. When this PR is done I will make that clear.

I guess if we take this approach and have a TVScale{T} with T that can take any type, it is always possible to make an extension with dispatches like transform(t::TVScale{T}, x) where T <: DynamicQuantities.Quantity. So the two options are not mutually exclusive, and option 2 provides a hopefully-sane fallback anyway.

@Ickaser
Copy link
Author

Ickaser commented Mar 26, 2025

Okay, tried widening types, etc. to accomodate Unitful types. For now, that means that transform accepts Reals, so non-Real numbers have to show up only in the outermost transform; inverse accepts all Numbers, of which Unitful.Quantity is a subtype.

A quick check suggests that transform and inverse work fine, but transform_and_logjac does not. Frankly, I have no idea how to treat the log-Jacobian for scaling by a unit--the Jacobian in that case should have units, I think, which would mean the log-Jacobian would require taking the log of a dimensioned quantity (which yields a Julia error, but is also not well-defined).

Should that be explicitly documented as not working for dimensional quantities? Is there a different way to treat the log-Jacobian?

@tpapp
Copy link
Owner

tpapp commented Mar 28, 2025

Frankly, I have no idea how to treat the log-Jacobian for scaling by a unit--the Jacobian in that case should have units

I don't think it makes sense, so it should error. Specifically, we should document that transform_and_logjac is not supported for transformations that do not result in <:Real numbers and should error in this case.

@Ickaser
Copy link
Author

Ickaser commented Mar 28, 2025

Okay, I've implemented it that way and added a note to the docs.

I have inverse_and_logjac defined for all of the individual scalar transforms, as well as the composite; should I move that from its own WIP place in the test suite to get tested with the rest of the scalar consistency tests?

With that, I'm not thinking of anything else that's missing from this implementation. Is there anything still on your list? Does it need more documentation or anything? Should we delete the ShiftedExp and ScaledShiftedLogistic, now that they have been superceded by this implementation?
(Note to self: need tests for the show methods of scalar transforms.)

@Ickaser
Copy link
Author

Ickaser commented Apr 2, 2025

I went ahead and added an alias TransformVariables.compose for Base.:∘, because Unicode is still sometimes annoying to type. Let me know if you don't like that option.

I've reworked the scalar show behavior so that all the old show tests still work. With that, I don't think we're breaking any API, just adding new features.

I think everything on my list is working at this point. Anything else you want to see in this?

@Ickaser
Copy link
Author

Ickaser commented Apr 2, 2025

(Remaining question: should we delete the ShiftedExp and ScaledShiftedLogistic, since they are not used in any public API nor in any of the internal constructions?)

@tpapp
Copy link
Owner

tpapp commented Apr 3, 2025

remaining question: should we delete the ShiftedExp and ScaledShiftedLogistic, since they are not used in any public API nor in any of the internal constructions?

  1. we should benchmark first, to see if they are equally effective,
  2. if yes, deprecate current ones in a minor release,
  3. then remove in the next major release.

I think everything on my list is working at this point. Anything else you want to see in this?

Sorry, I had computer issues and I am still catching up with FOSS work, will try to review the whole thing ASAP.

@Ickaser
Copy link
Author

Ickaser commented Apr 4, 2025

remaining question: should we delete the ShiftedExp and ScaledShiftedLogistic, since they are not used in any public API nor in any of the internal constructions?

  1. we should benchmark first, to see if they are equally effective,

  2. if yes, deprecate current ones in a minor release,

  3. then remove in the next major release.

  1. Benchmark in terms of performance (run time), I take it? I will try to put together a little script for that purpose.
    To make sure I understand 2 and 3, we will:
  2. In the minor release, make available the new composite transform functionality, but have as(Real, a, b) construct a ShiftedExp or ScaledShiftedLogistic as before
  3. In the major release, have as(Real, a, b) use the composite transforms instead

This PR already makes as(Real, a, b) use the composite transforms, so I will have to move some of the current unit tests and changes to a new PR so it can get applied for a major release. But this makes sense to me as a deprecation path, so (if I have understood the idea correctly) I am willing to do that.

Copy link
Owner

@tpapp tpapp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for all the fixes, this is shaping up nicely so just minor comments. If benchmarks show no regression, this is good to go.

@tpapp
Copy link
Owner

tpapp commented Apr 8, 2025

make available the new composite transform functionality, but have as(Real, a, b) construct a ShiftedExp or ScaledShiftedLogistic as before

Sorry, I take it back, I forgot that we never exported ShiftedExp etc, so no need to deprecate.

If the benchmarks show no regression, I think this is good to go.

@Ickaser
Copy link
Author

Ickaser commented Apr 8, 2025

Pretty simple benchmark, let me know if this is incomplete (first time really investigating something like this):

using TransformVariables
using BenchmarkTools
const TV = TransformVariables
btrans(t) = inverse(t, transform(t, rand()))
blogjac(t) = transform_and_logjac(t, rand())
t1 = TV.ShiftedExp(true, 4.0)
t2 = as(Real, 4.0, ∞)
@benchmark btrans(t1)
@benchmark btrans(t2)
BenchmarkTools.Trial: 10000 samples with 991 evaluations per sample.
 Range (min … max):  38.143 ns …  7.402 μs  ┊ GC (min … max): 0.00% … 98.79%
 Time  (median):     40.666 ns              ┊ GC (median):    0.00%
 Time  (mean ± σ):   47.260 ns ± 83.987 ns  ┊ GC (mean ± σ):  2.86% ±  1.94%

  █▄▆▅▄▃▂▂▁  ▁▁                                               ▁
  █████████▇▇█████▇▆▅▇▇▆▇▇▇▇▇█▆▇▆▇▇▇▆▆▇▇▇▇▆▆▆▆▄▄▆▅▅▅▄▆▅▅▅▅▂▄▄ █
  38.1 ns      Histogram: log(frequency) by time       123 ns <

 Memory estimate: 16 bytes, allocs estimate: 1.

BenchmarkTools.Trial: 10000 samples with 993 evaluations per sample.
 Range (min … max):  36.556 ns …  7.574 μs  ┊ GC (min … max): 0.00% … 99.01%
 Time  (median):     40.282 ns              ┊ GC (median):    0.00%
 Time  (mean ± σ):   47.887 ns ± 84.888 ns  ┊ GC (mean ± σ):  2.73% ±  1.94%

  █▄▇▄▃▃▂▁▁▂▁▁                                                ▁
  ███████████████▇▇▇▇█▇▇▇██▇▇██▇▇▇▇▆▆▆▅▆▆▆▅▆▆▆▆▇▅▄▄▅▄▆▅▄▅▅▃▄▄ █
  36.6 ns      Histogram: log(frequency) by time       145 ns <

 Memory estimate: 16 bytes, allocs estimate: 1.

The other benchmarks

@benchmark blogjac(t1)
@benchmark blogjac(t2)

t3 = TV.ShiftedExp(false, 4.0)
t4 = as(Real, -∞, 4.0)
@benchmark btrans(t3)
@benchmark btrans(t4)

t5 = TV.ScaledShiftedLogistic(2.0, 2.0)
t6 = as(Real, 2.0, 4.0)
@benchmark btrans(t5)
@benchmark btrans(t6)

all look basically the same, the finite interval takes slightly longer than an exp (as expected) but the two implementations are indistinguishable.

While I'm thinking about performance, it bears mentioning that, in the test suite, the tests where I compose lots of arbitrary transforms take a lot of time to compile. For example,

@time begin
    all_transforms = [TVShift(3.0), TVScale(2.0), TVExp(), TVLogistic(), TVNeg(), 
        TV.ScaledShiftedLogistic(2.0, 1.0), TV.ShiftedExp(true, -5), TV.ShiftedExp(false, 4.5)]
    for t1 in all_transforms, t2 in all_transforms, t3 in all_transforms
        t = t1  t2  t3
        t isa TransformVariables.CompositeScalarTransform{Tuple{typeof(t1), typeof(t2), typeof(t3)}}
    end
end

takes 5 seconds to compile the first time, then something like a millisecond to actually run. I don't know if that's a quirk in how I constructed the test. The tests still only take a couple minutes to run.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants