fultonn's comments

> Will this have reach and teeth though?

It'll have reach because MA has a long-arm statute and there's a rich history of applying that statute in the context of Chapter 93.

It'll have teeth but probably not to the effect that you hope.

This statute was written such that only the Attorney General can bring action; see Section 10(b). This diverges from a long history in the Commonwealth of allowing private individuals to bring civil suits for most types of Chapter 93 violations.

As a result, I anticipate that the most impactful change will be in the quantity and frequency of political donations to Mass AG candidates (and, in the case of contested primaries, to their aligned block of candidates up and down the ticket).

Consumer protection laws should always provide for a private cause of action. Otherwise they just function as a mechanism for legalized corruption.


I don't disagree with the thrust of your criticism of the dynamic (especially long term). But there is a legitimate concern that the first test cases to hit the courts need to be against quite unsympathetic, egregious violators rather than against surveillance dynamics that have been thoroughly normalized for decades. If people start bringing private suits against neighbors that have deployed Amazon surveillance cameras, "credit bureaus", private investigators, or big tech surveillance companies directly (e.g. Google), especially with weak legal arguments, it is likely to set some poor precedents and create political pushback.

Section 2 already limits applicability to persons collecting or processing data on not less than 60,000 consumers, so suits brought against neighbors would be (rightfully) dismissed.

The concern about poor precedent stemming from poor cases has some rational sense, but we have the benefit of experience. Empirically it just hasn't tended to play out like that in the case of consumer protection statutes in MA. One reason this doesn't happen in practice might be the limited bandwidth of the appellate process. The SJC could (and likely would) prioritize answering questions about the statute in the context of cases brought by the AG.

The longevity of pro-consumer laws in MA provides some good empirical data that cuts against the concern about push-back.


I'll admit my examples were pretty weak.

What I see is that this bill, while a fantastic development, is still just addressing the tip of an iceberg on an industry that has been festering for many decades now (I mean, the "Fair" Credit Reporting Act - aka regulatory capture by the early digital surveillance industry - was passed in 1970). So "pushback" doesn't necessarily mean this law being undone, but rather it ending up as the full amount of privacy we can expect rather than the first step of a hopeful trend.

For example look at how many more rights the GDPR grants. If a GDPR-analog were on the table in the US, the entire surveillance industry would balk. And these days the surveillance industry is basically the bulk of our "economy" (ie stock market valuations). And given the way "our" government works, I wouldn't be terribly hopeful about the individual liberty side prevailing over entrenched interests. Which is why I'm making an argument for more of a gradual shift.

Now having said that, perhaps it makes more sense for each bit of legislation to bite off fewer rights (as I'd say this legislation does), while including a private right of action so that the rights it does grant are maximally enforced. Having glaring violations of the law-as-written just sit there unaddressed is certainly its own powerful momentum-killer.


Couldn't this be mitigated by, say, having the private right of action not start until a few years into the applicability of the law?

IBM Research (https://research.ibm.com) | Boston, MA | Full-Time | Hybrid

We are hiring a Senior Scientist to join our team. We're working at the intersection of LLM training and the inference stack.

We're looking for someone who wants to work in the following intersection:

* Both implementation experience and theoretical background in compilers, programming languages, or programming models. We have an absolute requirement for someone with mathematical maturity in formal systems (programming language theory, type theory, proof assistants, etc.)

* Practical experience pretraining and RL-tuning LLMs (or skill-adjacent experience / expertise)

* Practical experience developing or modifying inference engines for LLMs

From an academic perspective, think "POPL/PLDI/OOPSLA/CAV/etc." + "NeurIPS/ICML/AISTATS/etc.".

The job posting for a Senior Scientist position, with rough salary ranges, will go live soon. I'll edit or post a reply with the link when it's live. For now, if you have expertise at some intersection of the above topics, please feel free to reach out: nathan@ibm.com

Application link for Research Scientist position: https://careers.ibm.com/en_US/careers/JobDetail?jobId=118292...

We also have a Research Engineering role for which we're actively interviewing: https://careers.ibm.com/en_US/careers/JobDetail?jobId=113488...


> We also have a Research Engineering role for which we're actively interviewing: https://careers.ibm.com/en_US/careers/JobDetail?jobId=113488...

This page says the job is closed.

Also, not sure if my email response to your last post made it through or not.


Thanks for letting me know.

Got your email. Let's connect there.


This bolsters OP's point.

It's the same as calling a gun a "powerful hole puncher".

There is a reasonable objection that a gun is such a powerful hole puncher that it is not merely a hole puncher. But the clear implication of that objection is that the user of the tool now has more responsibility and that the tool should be treated with more respect/care.

LLMs are a tool. The impact of using that tool is the responsibility of the end-user. As the tool at hand becomes more powerful, the care with which the end-user should treat that tool increases.

For some reason, with LLM-based systems, we seem to be going the opposite direction. As the tool becomes more capable people absolve themselves and others of more responsibility. This feels backwards to me.

(Aside: in a lot of ways, at least from a scientific and engineering perspective, modeling LLMs as "fundamentally auto-complete" is an incomplete theoretical model but one from which we can still get a lot of mileage.)


I've considered there's probably no ethical way to use contemporary AI when it is "out in front" doing anything of consequence. Your "AI is a tool and nothing more" frames ethical use of the technology for me.

And even then, there are such copyright issues with it. Is there no practical ethical use for AI? Responsible use doesn't equate with ethical use for me.


> there's probably no ethical way to use contemporary AI when it is "out in front" doing anything of consequence. Your "AI is a tool and nothing more" frames ethical use of the technology for me.

I've thought a lot about how to safely deploy autonomous systems (even did a whole PhD on the topic, lol).

I think one can ethically deploy a system that has some degree of autonomy. It takes a lot of work to do right. And the tooling for LLM-based systems isn't quite as mature as the tooling for e.g. control systems. Part of this is because so many resources in AI safety are misspent on problem statements that are myopic or grandiose. Between "don't say PII" and "prevent ASI extinction" there's a hard but tractable control systems-y view of AI safety.

But I don't think there is any sort of fundamental barrier that prevents us from building appropriately constrained LLM-based systems.

> And even then, there are such copyright issues with it. Is there no practical ethical use for AI? Responsible use doesn't equate with ethical use for me.

When responding to a position, especially on the internet, I try to empathize with the thing I'm responding to. Not just understand it, but sort of put myself in a mental state where I have an emotional attachment to my conversation partner's point of view.

With respect to Copyright as a legal framework in my country (USA): despite my best attempts, I really struggle to develop empathy for the viewpoint that LLMs/diffusion models are not a transformative use. I can certainly sympathize, but trying to actually put myself in the shoes of believing that training an LLM is a purely derivative and non-transformational work just feels far too alien. There are so many things that are "clearly transformative" but required so many orders of magnitude less scientific/technical/engineering genius.

Which isn't to say that the US legal system's definition of copyright is the morally correct one. With respect to copyright beyond the US legal system, or beyond legal denotations generally: I can certainly empathize.


> But I don't think there is any sort of fundamental barrier that prevents us from building appropriately constrained LLM-based systems.

This iteration of the tech, I agree. In future iterations that use intensive persuasion techniques, who can say?

> Which isn't to say that the US legal system's definition of copyright is the morally correct one.

The US legal system's definition of copyright is the morally correct one, though, because it is codified law. Immoral laws eventually get overturned, but until then it is the rule because the collective we says so right now.

What is the derivative work of an AI response? Who is the creator making its derivative works? The AI is not an entity, it is a software engine operating over an obfuscated index.

Beyond the muddiness of copyright, there is the question of human flourishing. How the heck would you train children and adolescents on the responsible use of AI?

The current UX, the "friend computer"-themed REPL, is chock-a-block with moral hazards. Loss of privacy and profiling, fostering undue trust, emotional dependence and manipulation. Like, I get that you're invested in the industry, but we should condemn this tech.


> What is the derivative work of an AI response? Who is the creator making its derivative works? The AI is not an entity, it is a software engine operating over an obfuscated index.

I was not talking about the output of models.

I'm referring to the model itself. The `.ckpt` file is clearly transformative wrt its training set. Or, at least, substantially more transformative than other things that have long received fair use protection.

> Like, I get that you're invested in the industry

On the contrary, I'm invested quite heavily in the exactly opposite hypothesis -- that the ChatGPT/Claude/Gemini UX you're referring to is not fit-for-purpose.

> How the heck would you train children and adolescents on the responsible use of AI?

By teaching them how it works, how it doesn't work, and to think of it as a unit of computation rather than an anthropomorphic entity.


> I'm referring to the model itself. The `.ckpt` file is clearly transformative wrt its training set. Or, at least, substantially more transformative than other things that have long received fair use protection.

Oh, I see. And the model weights are what one can make the copyright infringement claims on in the US?

Not to split hairs, but do you believe it's so transformative because you can't read plain text copies of original works in the weights or because the source material is so hopelessly discombobulated that the original work could not be reliably recreated?

I believe the 'hopelessly discombobulated' argument is probably pretty solid, but one could argue to a judge that the weights are something like JPEG compression. Sure the forged image of Mona Lisa is a bit foggy in the background and some of those details are incorrect, but the wry smile in the foreground is perfectly captured.

> On the contrary, I'm invested quite heavily in the exactly opposite hypothesis -- that the ChatGPT/Claude/Gemini UX you're referring to is not fit-for-purpose.

Oh! Excellent, carry on!

> rather than an anthropomorphic entity.

But it unfailingly passes the Turing test, at least with regards to an immature, non-discerning human mind like a child's. You may as well rub a lamp.


IBM Research | Boston, MA, USA (hybrid - 3 days in office; location flexibility - Boston preferred but NYC/SF possible) | Research Scientists and Research Engineers

Our team thinks there's a lot of value in co-design of software harnesses and LLMs, particularly for small/medium models, particularly in the open model space.

Our colleagues across the aisle do an awesome job at training the Granite model series, so we (physically and organizationally) sit in a uniquely good place to do impactful work in this space.

I am currently looking for early career scientists and engineers who are interested in LLMs and also one of {programming languages, formal methods, compilers}. Experience with Rust is a plus. Cool systems-y projects with formalizations on paper or in Rocq/Lean/etc. are also a plus. Neither is necessary, per se.

If this sounds interesting please send over an email: nathan@ibm.com


what I've done for a similar script in the past:

    answer_initial = llm(prompt=prompt, site=site)  # JSON with the answer plus anything needed for the heuristic checks.
    heuristic_results = heuristics(answer_initial)  # Rule-based.
    answer_final = llm(prompt=prompt, site=site, answer=answer_initial)
    mark_for_review = ...  # Basically just a bunch of hard-coded stuff I add to flag possible failures for review.

You can use an extremely small/cheap model for something like this -- granite 4.0 micro works fine for me, 3.3 8b did as well, both run on my macbook. YMMV / try different models and see how it goes.
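For anyone who wants something runnable, here is a minimal sketch of that two-pass pattern. Everything concrete in it is an assumption made for illustration: the JSON shape (`answer`, `evidence`), the two heuristic rules, the review rule, and `stub_llm`, which stands in for a real model call so the script runs offline.

```python
import json
from typing import Callable, List, Optional


def heuristics(answer: dict) -> List[str]:
    """Rule-based checks; returns descriptions of problems (empty means OK)."""
    problems = []
    if not answer.get("answer"):
        problems.append("empty answer")
    if not answer.get("evidence"):
        problems.append("no supporting evidence quoted")
    return problems


def run(llm: Callable[..., str], prompt: str, site: str) -> dict:
    """First pass, rule-based checks, second pass, then flag for review."""
    answer_initial = json.loads(llm(prompt=prompt, site=site))
    heuristic_results = heuristics(answer_initial)
    answer_final = json.loads(llm(prompt=prompt, site=site, answer=answer_initial))
    # Hypothetical review rule: any failed heuristic, or the two passes disagree.
    mark_for_review = bool(heuristic_results) or (
        answer_initial.get("answer") != answer_final.get("answer")
    )
    return {
        "answer": answer_final,
        "heuristic_results": heuristic_results,
        "mark_for_review": mark_for_review,
    }


def stub_llm(prompt: str, site: str, answer: Optional[dict] = None) -> str:
    """Stand-in for a real model call; replace with your own client."""
    if answer is None:
        # First pass: an answer with no quoted evidence.
        return json.dumps({"answer": "yes", "evidence": []})
    # Second pass: keeps the answer and adds evidence.
    return json.dumps({"answer": answer["answer"], "evidence": ["quoted text"]})


result = run(stub_llm, prompt="Does the site list pricing?", site="example.com")
print(result["mark_for_review"], result["heuristic_results"])
```

With the stub, the first pass has no evidence, so the item gets flagged even though both passes agree on the answer. In a real script the flagging rules are where most of the hand-tuning goes.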


> Is there a way to load this into Ollama?

Yes, the granite 4 models are on ollama:

https://ollama.com/library/granite4

> but my interest is specifically in privacy respecting LLMs -- my goal is to run the most powerful one I can on my personal machine

The HF Spaces demo for granite 4 nano does run on your local machine, using Transformers.js and ONNX. After downloading the model weights you can disconnect from the internet and things should still work. It's all happening in your browser, locally.

Of course ollama is preferable for your own dev environment. But ONNX and transformers.js are amazingly useful for edge deployment and easily sharing things with non-technical users. When I want to bundle up a little demo for something I typically just do that instead of the old way I did things (bundle it all up on a server and eat the inference cost).


Thanks for this pointer and explanation, I appreciate it.

Also my "dev environment" is vi -- I come from infosec (so basically a glorified sysadmin) so I'm mostly making little bash and python scripts, so I'm learning a lot of new things about software engineering as I explore this space :-)

Edit: Hey which of the models on that page were you referring to? I'm grabbing one now that's apparently double digit GB? Or were you saying they're not CPU/ram intensive but still a bit big?


> Edit: Hey which of the models on that page were you referring to?

I was referring to the smaller ones -- `granite4:micro`, `granite4:latest`, `granite4:350m`.

> I'm grabbing one now that's apparently double digit GB?

You are probably downloading one of these two ids: `granite4:small-h` or `granite4:32b-a9b-h`.

The "small" model _is_ small in relative terms, but is also the largest of the currently released granite models! At 32B parameters (19GB download) it's runnable locally but not in the same "run on your laptop with acceptable performance" category as the nano/micro models.
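A quick back-of-the-envelope check on those two numbers, as a sketch. The 32B and 19GB figures are from the comment above; treating 1 GB as 10^9 bytes and comparing against 2 bytes per parameter (16-bit weights) are assumptions made for the arithmetic.

```python
def bits_per_parameter(download_gb: float, params_billion: float) -> float:
    """Average bits stored per parameter, treating 1 GB as 10**9 bytes."""
    return (download_gb * 1e9 * 8) / (params_billion * 1e9)


def size_at_16_bit_gb(params_billion: float) -> float:
    """Size in GB if every parameter were stored in 2 bytes."""
    return params_billion * 2


print(bits_per_parameter(19, 32))  # 4.75
print(size_at_16_bit_gb(32))       # 64
```

So the download works out to a bit under 5 bits per parameter, which suggests a quantized build rather than 16-bit weights (those would be roughly 64GB). Either way, you need well over 19GB of memory free to run it, which is what takes it out of "casual laptop" territory.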

> Also my "dev environment" is vi -- I come from infosec (so basically a glorified sysadmin) so I'm mostly making little bash and python scripts, so I'm learning a lot of new things about software engineering as I explore this space :-)

Shameless plug: if you're writing Python scripts to automate things using small locally hosted models, consider trying out https://github.com/generative-computing/mellea

Mellea tries to nudge toward good software engineering practices -- breaking down big tasks into smaller parts, checking outputs after nondeterministic steps, thinking in terms of data structures and invariants rather than flow charts, etc. We built it with "actual fully automated robust workflows" in mind. You can use it with big models or small models, but it really shines when used with small models.
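To make "checking outputs after nondeterministic steps" concrete, here is a rough plain-Python sketch of the general idea. This is not Mellea's API: `Requirement`, `generate_with_checks`, and the stub generator are hypothetical names invented for this example, and the stub stands in for a real model call.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Requirement:
    """An invariant the output must satisfy, with a human-readable description."""
    description: str
    check: Callable[[str], bool]


def generate_with_checks(
    generate: Callable[[List[str]], str],
    requirements: List[Requirement],
    max_attempts: int = 3,
) -> Optional[str]:
    """Call `generate`, validate the output, and retry with feedback on failure."""
    feedback: List[str] = []
    for _ in range(max_attempts):
        output = generate(feedback)
        failed = [r.description for r in requirements if not r.check(output)]
        if not failed:
            return output
        feedback = failed  # tell the next attempt what went wrong
    return None  # caller decides what to do when nothing passes


def make_stub() -> Callable[[List[str]], str]:
    """Stand-in for a model: a bad draft first, then a good one."""
    drafts = iter(["THIS IS TOO LONG AND SHOUTY " * 5, "short summary"])
    return lambda feedback: next(drafts)


requirements = [
    Requirement("at most 40 characters", lambda s: len(s) <= 40),
    Requirement("not all caps", lambda s: not s.isupper()),
]
result = generate_with_checks(make_stub(), requirements)
print(result)  # short summary
```

The point of the shape is that each nondeterministic step is small, its invariants are written down as data, and a failure is an explicit `None` rather than a bad string flowing silently into the next step.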

