Celia Cintas

Localizing Persona Representations in LLMs

Celia Cintas — Mon, 08 Sep 2025 10:26:05 +0000

Introduction

Understanding the mechanisms by which large language models (LLMs) process information, store knowledge, and generate outputs remains a key open question in research [0, 1].

A persona is a natural language portrayal of an imagined individual belonging to some demographic group or reflecting certain personality traits [2]. Personas are often used to define the personality or perspective the LLM should adopt when interacting with users, e.g., by prompting “Suppose you are a person who …” followed by a description of a particular trait or belief. This can significantly influence language generation by setting a tone appropriate for the context (e.g., empathetic or professional) and by affecting behavior and reasoning capabilities. Personas can enhance user experience and engagement by making models more relatable and context-aware, and can improve generated output. Personas have also attracted increasing attention, particularly in the development of trustworthy models [3, 4].

Previous research has demonstrated that personas may elicit toxic responses and perpetuate stereotypes in language models [5], and can produce extreme political or cultural views [6]. Moreover, personas have been (mis)used to circumvent built-in safety mechanisms by instructing models to adopt specific roles [7]. Understanding how LLMs encode personas is essential for harm mitigation methods, aligning models with diverse beliefs, and tailoring outputs to users’ preferences.

In this work, we feed statements associated with different personas into various LLMs and extract their internal representations (i.e., activation vectors). We then analyze these representations to address the following two questions:

Where in the model are persona representations encoded? Specifically, which layers in the LLM exhibit the strongest signals for encoding persona-specific information?

How do these representations vary across different personas? In particular, are there consistent, uniquely activated locations within a given LLM layer where distinct persona representations are encoded?

Identifying Layers With Strongest Persona Representations

We first study which layers provide the strongest signals for encoding personas for different LLMs. Specifically, we identify the layer that exhibits the greatest divergence between the principal components (PCs) of the last token representations for sentences corresponding to a given persona. Our findings lay the groundwork for our next step, where we seek to localize sets of activations within a layer encoding persona information. Across the models, the largest distances are found in the later layers (20–31). The table reports additional metrics evaluating the separation, overlap, and compactness of the groups. Most measures indicate that the final layer of Llama3-8B-Instruct achieves the strongest separation. We find, however, that for some personas, certain metrics favor earlier layers or other models.

This suggests that while Llama3-8B-Instruct generally provides the best overall separation, for persona-specific applications, evaluating different metrics and models might be beneficial. Overall, later layers exhibit the greatest separation across LLMs, indicating that persona representations become increasingly refined, with final layers encoding the most discriminative features.
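The layer-selection idea can be sketched in a few lines. This is a minimal illustration, not the code from the paper: it assumes we already have last-token activations per layer for two groups of statements, projects them onto their top principal components, and uses the Euclidean distance between group centroids as a stand-in for the divergence measures reported in the table. The function names and the synthetic data are hypothetical.

```python
import numpy as np

def pca_project(x, n_components=2):
    """Project the rows of x onto their top principal components (via SVD)."""
    centered = x - x.mean(axis=0, keepdims=True)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T

def layer_separation(acts_a, acts_b, n_components=2):
    """Distance between the two groups' centroids in a shared PC space."""
    projected = pca_project(np.vstack([acts_a, acts_b]), n_components)
    centroid_a = projected[: len(acts_a)].mean(axis=0)
    centroid_b = projected[len(acts_a):].mean(axis=0)
    return float(np.linalg.norm(centroid_a - centroid_b))

def strongest_layer(per_layer_a, per_layer_b):
    """Return the index of the layer with the largest separation, plus all scores."""
    scores = [layer_separation(a, b) for a, b in zip(per_layer_a, per_layer_b)]
    return int(np.argmax(scores)), scores

# Synthetic stand-in for real activations: the two groups drift apart with depth.
rng = np.random.default_rng(0)
n_layers, n_sentences, hidden = 4, 50, 16
per_layer_a = [rng.normal(0.0, 1.0, (n_sentences, hidden)) for _ in range(n_layers)]
per_layer_b = [rng.normal(0.5 * layer, 1.0, (n_sentences, hidden)) for layer in range(n_layers)]

best_layer, separation_scores = strongest_layer(per_layer_a, per_layer_b)
print(best_layer, [round(s, 2) for s in separation_scores])
```

With real models, the per-layer arrays would come from the hidden states of the last token of each statement. Other separation, overlap, or compactness metrics can be swapped in for the centroid distance.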

Identifying a Layer’s Activations With Strongest Persona Representations

Next, we investigate whether distinct, consistent activation groups within a layer encode different personas. Building on our previous findings, we compare the last token representations from Llama3-8B-Instruct. We use Deep Scan [8, 9, 10, 11] to identify the activation subsets most indicative of persona-specific information, which we refer to as salient activations. Before looking at subsets of salient activations, we validate the Deep Scan results across different levels of granularity. Performance is consistent for Level 2 and Level 0, so the following results are reported at these two granularities.

Politics personas display much lower overlap, with only 9.42% (386) shared activations across all.

For Ethics personas, only a small fraction of activations are unique—ranging from 0.37% (15 activations) to 1.39% (57).
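The shared and unique counts above are set operations over the salient activation indices of each persona. Here is a small sketch, assuming each persona's salient activations are available as a collection of indices. The persona names and index sets are toy values, not results. The reported percentages are consistent with 4096 activations per layer (386 / 4096 ≈ 9.42%), which is an inference from the numbers rather than something stated in the post.

```python
def overlap_stats(salient_by_persona, n_activations):
    """Shared and unique salient activations across personas, with percentages."""
    sets = {name: set(indices) for name, indices in salient_by_persona.items()}
    # Activations that are salient for every persona.
    shared = set.intersection(*sets.values())
    unique = {}
    for name, nodes in sets.items():
        others = set().union(*(s for other, s in sets.items() if other != name))
        # Activations that are salient for this persona and no other.
        unique[name] = nodes - others

    def percent(subset):
        return 100.0 * len(subset) / n_activations

    unique_pct = {name: percent(nodes) for name, nodes in unique.items()}
    return shared, unique, percent(shared), unique_pct

# Toy example with 10 activations and three hypothetical personas.
salient = {
    "persona_a": [0, 1, 2, 3],
    "persona_b": [2, 3, 4],
    "persona_c": [2, 3, 5],
}
shared, unique, shared_pct, unique_pct = overlap_stats(salient, n_activations=10)
print(sorted(shared), shared_pct)  # [2, 3] 20.0
```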

Significance

In this study, we analyze last token activations from 3 families of decoder-only LLMs using persona-specific statements from 14 datasets across Politics, Ethics, and Personality topics. We show that the strongest signal for separating persona information lies in the final third of the layers. Our results suggest that political views have distinctly localized activations in the last layer of Llama3, while ethical values show greater polysemantic overlap.

Limitations & Future Work

While these initial results shed light on where persona representations are encoded, our analyses are specific to the selected group of datasets and may not generalize well to other data sources. The datasets are written in English and primarily reflect WEIRD perspectives, with political views largely centered on U.S. politics. The datasets are also LLM-generated, which has several shortcomings.

In the coming months, we will explore a wider range of models, personas, and datasets, and incorporate beliefs, values, and traits from more diverse cultural contexts. We will also investigate controlled editing of the internal representations we found.

Look out for our work @ AIES 2025 #119 Localizing Persona Representations in LLMs

The full paper, with more details and experiments, can be found on arXiv, and the code on GitHub.

The post Localizing Persona Representations in LLMs appeared first on Celia Cintas.

Spatially Constrained Search in Optical Flow Networks

Celia Cintas — Tue, 25 Jul 2023 09:00:50 +0000

Introduction

With the emergence of deep learning techniques and the availability of large-scale datasets, optical flow estimation performance has improved significantly. However, these networks are vulnerable to adversarial attacks, like any other deep learning approach. Recent work [0, 1, 2] focuses on attacking optical flow networks with patches pasted onto the frames. These patch-based attacks cause occlusions and motion boundaries in the areas where the attack is placed, which degrade the performance of current optical flow estimators [3].

Self-driving cars use these models to estimate the motion of objects on the road and surroundings and use this information to decide on close-to-real-time actions/measurements. Attacking these networks means that the estimated motion could be completely wrong, affecting the decision process based on this information.

Animation from https://perceiving-systems.blog/en/post/the-road-to-safe-self-driving

What is Subset Scanning?

As our approach is based on Fast Generalized Subset Scanning [4], we start with a quick review, so it is easier to understand why the extensions provided in this paper are relevant to the patch-based attack problem. Subset Scanning treats neural networks as data-generating systems and applies anomalous pattern detection methods to activation data. It efficiently searches over a large combinatorial space to find groups of records that differ the most from expected behavior. While existing works employed Subset Scanning to detect different types of OOD samples [5, 6, 7], we are the first to use it to (1) detect patch-based attacks, (2) on sequential images across temporal dimensions, (3) on networks for regression tasks.

Our working assumption is that activations from patch-attacked samples have a different distribution of p-values than those from clean samples. For us, a p-value is the proportion of activations drawn from the same node over multiple clean samples that are greater than the activation from the evaluation sample.
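As a concrete illustration of this definition, here is a minimal sketch of the empirical p-value computation with NumPy. The function name and the toy numbers are made up for this example, and the sketch ignores ties between activations.

```python
import numpy as np

def empirical_pvalues(clean_acts, eval_acts):
    """Empirical p-value per node.

    clean_acts: (n_clean_samples, n_nodes) activations from clean samples.
    eval_acts:  (n_nodes,) activations from the sample under evaluation.
    Returns, for each node, the proportion of clean activations that are
    greater than the evaluated activation at that node.
    """
    clean_acts = np.asarray(clean_acts, dtype=float)
    eval_acts = np.asarray(eval_acts, dtype=float)
    return (clean_acts > eval_acts[None, :]).mean(axis=0)

# Four clean samples, two nodes.
clean = np.array([[0.1, 0.5],
                  [0.2, 0.6],
                  [0.3, 0.7],
                  [0.4, 0.8]])
evaluated = np.array([0.25, 0.9])
pvalues = empirical_pvalues(clean, evaluated)
print(pvalues)  # [0.5 0. ]
```

The second node gets a p-value of 0 because the evaluated activation is larger than anything seen in the clean samples. Extreme values like this are what the scan looks for.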

How do we score one sample at a given node? We use scoring functions on a new set of frames to measure how much the p-values deviate from uniform. Although Subset Scanning can use parametric scoring functions (e.g., Gaussian, Poisson), the distribution of activations within particular layers is highly skewed and, in some cases, bi-modal. We use non-parametric scan statistics, a.k.a. NPSS (check out [4] for more details), which make minimal assumptions about the underlying distribution of node activations and enable scanning across different types of layers.

Although NPSS provides a means to evaluate the anomalousness of a subset of activations for a given input, discovering which of the 2^J possible subsets provides the most evidence of an anomalous pattern is computationally infeasible for large J, which is the case for the size of layers in deep learning models. That is why we need a priority function, in this case the proportion of p-values under a threshold. We have a guarantee that the subset maximizing the score will consist only of the top-k highest-priority records.
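To make the search concrete, here is a minimal sketch of the top-k priority scan. It assumes the Berk-Jones statistic as the NPSS scoring function (one common choice, not necessarily the one used in our experiments) and a small hypothetical grid of thresholds. Each record holds several p-values, and its priority is the number of them at or below the threshold.

```python
import numpy as np

def berk_jones(n_alpha, n, alpha):
    """Berk-Jones score: n * KL(n_alpha / n || alpha), zero if no excess."""
    observed = n_alpha / n
    if observed <= alpha:
        return 0.0
    if observed >= 1.0:
        return n * np.log(1.0 / alpha)
    return n * (observed * np.log(observed / alpha)
                + (1 - observed) * np.log((1 - observed) / (1 - alpha)))

def ltss_scan(pvalues, alphas=(0.05, 0.1, 0.2, 0.5)):
    """Find the highest-scoring subset of records.

    pvalues: (n_records, n_values_per_record).
    Instead of trying all 2^J subsets, records are sorted by priority and
    only the J top-k prefixes are scored for each threshold alpha.
    """
    pvalues = np.asarray(pvalues, dtype=float)
    n_records, n_per_record = pvalues.shape
    best_score, best_subset, best_alpha = 0.0, np.array([], dtype=int), None
    for alpha in alphas:
        counts = (pvalues <= alpha).sum(axis=1)   # priority of each record
        order = np.argsort(-counts, kind="stable")
        cumulative = np.cumsum(counts[order])
        for k in range(1, n_records + 1):
            score = berk_jones(cumulative[k - 1], k * n_per_record, alpha)
            if score > best_score:
                best_score, best_subset, best_alpha = score, np.sort(order[:k]), alpha
    return best_score, best_subset, best_alpha

# Six records with 20 p-values each; records 0 and 1 are made anomalous.
rng = np.random.default_rng(0)
pvalues = rng.uniform(size=(6, 20))
pvalues[:2] = rng.uniform(0.0, 0.04, size=(2, 20))
score, subset, alpha = ltss_scan(pvalues)
print(subset, alpha, round(float(score), 2))
```

The inner loop scores only J prefixes per threshold, which is what makes scanning large layers tractable.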

Spatially Constrained Search Space in Optical Flow Networks

The vanilla approach [8] forces the inclusion of all channels, which naively implies that every channel in the inner layer being scanned is impacted by a patch attack on an optical flow network. A more realistic expectation is that the attack affects only a subset of the channels. By introducing Spatial-Channel Optimization (SCO), we instead return a subset of spatial locations crossed with a subset of channels. While SCO allows us to find an anomalous subset of node activations across spatial locations, these detected anomalous locations may be far apart, spanning the entire frame of the given test image pairs. We want to further constrain the search space for subset scanning so that the detected anomalous locations strictly come from a local spatial neighborhood. Thus, we enforce a proximity constraint where our optimization is only applied to a k×k spatial region (the yellow region in the figure below).

This ensures that the detected subset of anomalous locations only comes from this spatial neighborhood. We do this for every k×k neighborhood in the feature map and return the subset of p-values from the k×k neighborhood that yields the highest NPSS score. Given the subset of detected p-values from a k×k neighborhood, we can easily localize the attack by finding the locations in the inner-layer feature map where these p-values occur. These detected locations in the feature maps can be up-sampled, if needed, to match the resolution of the input frames, or vice versa. The localized attacks can be used for mitigation or simply to mask the area for downstream flow estimation.
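The proximity constraint can be sketched as a sliding-window scan. The example below is a simplified illustration, not the code from the paper: it assumes a (channels, height, width) map of p-values, treats every spatial location as a record whose values are its channel p-values, scores subsets with the Berk-Jones statistic, and keeps all channels rather than also optimizing over a channel subset. All function names are hypothetical.

```python
import numpy as np

def berk_jones(n_alpha, n, alpha):
    """Berk-Jones score: n * KL(n_alpha / n || alpha), zero if no excess."""
    observed = n_alpha / n
    if observed <= alpha:
        return 0.0
    if observed >= 1.0:
        return n * np.log(1.0 / alpha)
    return n * (observed * np.log(observed / alpha)
                + (1 - observed) * np.log((1 - observed) / (1 - alpha)))

def scan_locations(records, alphas):
    """Top-k priority scan over locations; records is (n_locations, n_channels)."""
    n_locations, n_channels = records.shape
    best_score, best_subset = 0.0, np.array([], dtype=int)
    for alpha in alphas:
        counts = (records <= alpha).sum(axis=1)
        order = np.argsort(-counts, kind="stable")
        cumulative = np.cumsum(counts[order])
        for k in range(1, n_locations + 1):
            score = berk_jones(cumulative[k - 1], k * n_channels, alpha)
            if score > best_score:
                best_score, best_subset = score, order[:k]
    return best_score, best_subset

def spatially_constrained_scan(pvalue_map, k=3, alphas=(0.05, 0.1, 0.2)):
    """Scan every k x k neighborhood and keep the highest-scoring one.

    pvalue_map: (channels, height, width).
    Returns the best score and a (height, width) mask of detected locations.
    """
    channels, height, width = pvalue_map.shape
    best_score, best_mask = 0.0, np.zeros((height, width), dtype=bool)
    for top in range(height - k + 1):
        for left in range(width - k + 1):
            window = pvalue_map[:, top:top + k, left:left + k]
            records = window.reshape(channels, k * k).T   # (k*k, channels)
            score, subset = scan_locations(records, alphas)
            if score > best_score:
                rows, cols = np.divmod(subset, k)         # flat index -> (row, col)
                mask = np.zeros((height, width), dtype=bool)
                mask[top + rows, left + cols] = True
                best_score, best_mask = score, mask
    return best_score, best_mask

# Synthetic 8-channel, 10x10 p-value map with a 3x3 "patch" of small p-values.
rng = np.random.default_rng(0)
pvalue_map = rng.uniform(size=(8, 10, 10))
pvalue_map[:, 4:7, 5:8] = rng.uniform(0.0, 0.03, size=(8, 3, 3))
score, mask = spatially_constrained_scan(pvalue_map, k=3)
print(np.argwhere(mask))
```

The returned mask lives at feature-map resolution. To overlay it on the input frames, it would need to be up-sampled, as described above.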

Experimental Setup

We validate our approach using four state-of-the-art flow estimators. We choose the first layer in each network’s components when selecting inner layers to apply our proposed method. Specifically, we select the first layer of the encoder and decoder module and their correlation layer.

Datasets and Attacks

Following previous papers [0, 1, 2], we use the KITTI 2015, raw KITTI, MPI-Sintel, and raw Sintel datasets. KITTI consists of road scene images with sparse optical flow labels (2015) and without labels (raw). MPI-Sintel contains 23 sequences from the computer-animated short film Sintel, with flow labels. Its raw frames without labels have also been used in previous un- or semi-supervised flow estimators. We constructed patch-based adversarial attacks on the four flow networks. The figure below shows the change in end-point error with and without the adversarial patch attack. We use patches of four different sizes with respect to the input image resolution. As expected, performance worsens as we increase the patch size. In terms of flow networks, these patch attacks harm the performance of FlowNetC the most and RAFT the least.
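For readers new to the metric, end-point error (EPE) is the average Euclidean distance between predicted and ground-truth flow vectors. Here is a minimal sketch, assuming flows stored as (height, width, 2) arrays and an optional validity mask for sparse labels such as those in KITTI 2015. The function name and the toy values are ours.

```python
import numpy as np

def end_point_error(flow_pred, flow_gt, valid=None):
    """Mean Euclidean distance between predicted and ground-truth flow vectors.

    flow_pred, flow_gt: (height, width, 2) arrays of (u, v) displacements.
    valid: optional (height, width) boolean mask of pixels that have labels.
    """
    error_map = np.linalg.norm(np.asarray(flow_pred) - np.asarray(flow_gt), axis=-1)
    if valid is not None:
        error_map = error_map[valid]
    return float(error_map.mean())

flow_gt = np.zeros((2, 2, 2))
flow_pred = np.tile([3.0, 4.0], (2, 2, 1))   # every pixel is off by (3, 4)
flow_pred[0, 0] = [0.0, 0.0]                 # except one perfect pixel
valid = np.array([[False, True], [True, True]])

print(end_point_error(flow_pred, flow_gt))         # 3.75
print(end_point_error(flow_pred, flow_gt, valid))  # 5.0
```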

Results

Overall, we see higher detection performance when we include spatial information; when we lose spatial information, the method cannot detect the attacks for any model. This can be observed by looking at the score distributions of the clean (blue) and attacked (orange) test sets. The two distributions are more separable using SADL than the baseline. While our method shows clear improvements across multiple networks and datasets, both our method and the baseline struggle on MPI-Sintel for FlowNetC and on KITTI 2015 for PWCNet.

We can see an example visualization of a detected subset of anomalous locations in the feature space for the attacks of size p=153 for KITTI 2015 (top) and MPI-Sintel (bottom). We can successfully detect a subset of anomalous locations that align with the location of the patch attack.

Conclusion & Next Steps

We showcase how constrained search space can improve the detection and localization of patch-based adversarial attacks in optical flow estimators. We use a spatially constrained subset scanning on the inner layer in an unsupervised manner without any training or prior knowledge of the attacks. We further give insights into which layers are most affected by these attacks for various flow networks. The immediate next step could utilize detected and localized attacks to devise mitigation techniques for flow estimators. More details in this presentation.

Look out for our work @ IJCAI 2023

Wednesday, 23rd August at 11:45-12:45 Computer Vision (3/6) Session

#3357 Spatially Constrained Adversarial Attack Detection and Localization in the Representation Space of Optical Flow Networks. Hannah Kim; Celia Cintas; Girmaw Abebe Tadesse; Skyler Speakman.

References

[0] Ranjan, A., Janai, J., Geiger, A. and Black, M.J., 2019. Attacking optical flow. In Proceedings of the IEEE/CVF International Conference on Computer Vision (pp. 2404-2413).

[1] Schrodi, S., Saikia, T. and Brox, T., 2021. What causes optical flow networks to be vulnerable to physical adversarial attacks. arXiv preprint arXiv:2103.16255.

[2] Schrodi, S., Saikia, T. and Brox, T., 2022. Towards understanding adversarial robustness of optical flow networks. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (pp. 8916-8924).

[3] Kim, H.H., Yu, S. and Tomasi, C., 2021. Joint detection of motion boundaries and occlusions. The British Machine Vision Conference (BMVC), 2021.

[4] McFowland, E., Speakman, S. and Neill, D.B., 2013. Fast generalized subset scan for anomalous pattern detection. The Journal of Machine Learning Research, 14(1), pp.1533-1561.

[5] Cintas, C., Das, P., Quanz, B., Tadesse, G.A., Speakman, S. and Chen, P.Y., 2022. Towards creativity characterization of generative models via group-based subset scanning. IJCAI 2022.

[6] Kim, H., Tadesse, G.A., Cintas, C., Speakman, S. and Varshney, K., 2022, March. Out-of-distribution detection in dermatology using input perturbation and subset scanning. In 2022 IEEE 19th International Symposium on Biomedical Imaging (ISBI) (pp. 1-4). IEEE.

[7] Akinwande, V., Cintas, C., Speakman, S. and Sridharan, S., 2020. Identifying audio adversarial examples via anomalous pattern detection. arXiv preprint arXiv:2002.05463.

[8] Cintas, C., Speakman, S., Akinwande, V., Ogallo, W., Weldemariam, K., Sridharan, S. and McFowland, E., 2021, January. Detecting adversarial attacks via subset scanning of autoencoder activations and reconstruction error. In Proceedings of the Twenty-Ninth International Conference on International Joint Conferences on Artificial Intelligence (pp. 876-882).

The post Spatially Constrained Search in Optical Flow Networks appeared first on Celia Cintas.

Completion of ceramics with generative models for Cultural Heritage studies

Celia Cintas — Thu, 13 Jul 2023 10:15:15 +0000

Why is it important to study fragments in Cultural Heritage Studies?

Ceramic pottery is among the most frequently discovered types of archaeological artifact. Since pieces are usually short-lived (in archaeological terms), researchers find these artifacts useful to analyze chronological and geographical features, given that shape and decoration are subject to significant changes over time and space [0]. This analysis gives a basis for dating the archaeological strata and provides evidence from a large set of valuable data, such as local production, trade relations, and consumer behavior of the local population. Unfortunately, ceramics are fragile, so most of the ceramics recovered from archaeological sites are fractured and the vast majority of the available material appears in fragments. Reassembling the fragments is a daunting, delicate, and time-consuming task, done almost exclusively by hand. It requires physical manipulation of the fragments, which should ideally be kept as brief as possible for conservation purposes.

Iberian Pottery Data

In this project, we used a collection corresponding to Iberian wheel-made pottery from various archaeological sites of the upper valley of the Guadalquivir River (Spain). The ceramics are classified into eleven different classes based on their shape. These classes consider the forms of the lip, neck, body, base, and handles and the relative ratios between their sizes. Nine of these classes correspond to closed pottery shapes, and two others belong to open ones [1].

Given that the existing dataset is based on wheel-made pottery techniques, we can assume small asymmetric perturbations in the ceramic. We generate each 3D model as a solid of revolution. First, we extract the shape information from the profile by means of semilandmarks equally spaced in the contours. Then we do the spin, converting the semilandmarks capturing the 2D pottery shape information into a mesh. Lastly, we transform the mesh into a set of voxels.
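As a rough illustration of the solid-of-revolution idea, the sketch below spins a profile directly into a voxel grid. It is a simplification under stated assumptions: the profile is given as one wall radius per height slice (in voxel units), and the intermediate mesh step described above is skipped. The function name and the toy profile are hypothetical.

```python
import numpy as np

def revolve_profile(radii, grid_size, wall=1.0):
    """Voxelize a thin-walled solid of revolution.

    radii: wall radius (in voxels) at each height slice.
    Returns a boolean array indexed as voxels[z, y, x].
    """
    radii = np.asarray(radii, dtype=float)
    center = (grid_size - 1) / 2.0
    ys, xs = np.mgrid[0:grid_size, 0:grid_size]
    # Distance of every (y, x) cell from the rotation axis.
    distance = np.hypot(xs - center, ys - center)
    # A voxel is filled when it lies within half a wall thickness of the profile.
    return np.abs(distance[None, :, :] - radii[:, None, None]) <= wall / 2.0

# Toy profile: a vessel that widens from the base to the lip.
profile = np.linspace(4.0, 10.0, 16)
voxels = revolve_profile(profile, grid_size=32)
print(voxels.shape, int(voxels.sum()))
```

Because every slice depends only on the distance to the axis, the result is rotationally symmetric by construction, which matches the wheel-made assumption.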

After we have our complete voxelized model, we fragment it using a method based on the Discrete Voronoi Chain (DVC) algorithm. The DVC algorithm is composed of two steps. First, it generates a random list of Voronoi region centers in the model, each corresponding to a fragment. Second, it assigns the voxel to a section by following a region-growing approach taking each center as a seed until all the voxels have been traversed. However, this procedure can assign voxels in the border between two regions to an incorrect one. Therefore, an additional distance check to the centers of each region is required to guarantee the correct assignment. The resulting number of fragments depends on the type of vessel (closed or open).
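The fragmentation step can be approximated with a nearest-center assignment. This is a simplified stand-in for the DVC procedure described above, not the actual algorithm: it picks random occupied voxels as region centers and assigns every occupied voxel to its closest center, which is the result the final distance check is meant to guarantee, but it skips the region-growing pass. Names and sizes are illustrative.

```python
import numpy as np

def voronoi_fragments(voxels, n_fragments, seed=0):
    """Label each occupied voxel with the index of its nearest random center.

    voxels: boolean 3D array. Empty voxels get the label -1.
    """
    rng = np.random.default_rng(seed)
    coords = np.argwhere(voxels)                                   # (n_occupied, 3)
    chosen = rng.choice(len(coords), size=n_fragments, replace=False)
    centers = coords[chosen]
    # Squared distance from every occupied voxel to every center.
    squared = ((coords[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)
    labels = np.full(voxels.shape, -1, dtype=int)
    labels[tuple(coords.T)] = squared.argmin(axis=1)
    return labels

# Toy solid block inside an 8x8x8 grid, broken into 4 fragments.
voxels = np.zeros((8, 8, 8), dtype=bool)
voxels[1:7, 1:7, 1:7] = True
labels = voronoi_fragments(voxels, n_fragments=4)
fragment = labels == 0   # one fragment that could be fed to the completion model
print(np.unique(labels[voxels]), int(fragment.sum()))
```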

Generative Adversarial Networks

A typical GAN [2] framework contains a generative (G) and a discriminative (D) neural network such that G(z) aims (in our context) to generate realistic artifacts, while D(x) learns to discriminate whether a sample comes from the real data distribution or not.
In our case, z is a voxelized input fragment, unlike in [3], and G(z) represents the generator function that maps the fragment z to the data space of a complete voxelized Iberian pottery.
This type of network is named Autoencoding GAN (AE-GAN): G includes an encoder network trained to map each fragment sample to a point in latent space [4], and a decoder that learns to map each point to a complete pottery.
The data element x corresponds to a three-dimensional binary array containing the voxelized pottery geometry. D(G(z)) is the probability that the output of the generator G is a real artifact from the Iberian pottery 3D dataset.
D tries to maximize log(D(x)), the log-probability of correctly classifying real voxelized ceramics, while G tries to minimize log(1 - D(G(z))), the log-probability of D recognizing the outputs generated by G as fake.
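The two objectives can be written down in a few lines. The sketch below uses the standard GAN formulation from [2] with NumPy, taking discriminator outputs as plain probabilities. It is not our training code, and the function names and toy probabilities are made up.

```python
import numpy as np

def discriminator_loss(d_real, d_fake, eps=1e-7):
    """Negated D objective: D maximizes log D(x) + log(1 - D(G(z)))."""
    d_real = np.clip(d_real, eps, 1 - eps)
    d_fake = np.clip(d_fake, eps, 1 - eps)
    return float(-np.mean(np.log(d_real) + np.log(1 - d_fake)))

def generator_loss(d_fake, eps=1e-7):
    """G objective: minimize log(1 - D(G(z)))."""
    d_fake = np.clip(d_fake, eps, 1 - eps)
    return float(np.mean(np.log(1 - d_fake)))

# Toy discriminator outputs for a batch of real and completed potteries.
d_real = np.array([0.9, 0.8, 0.95])
fooled = np.array([0.7, 0.8, 0.6])       # D believes the completions are real
caught = np.array([0.1, 0.2, 0.05])      # D spots the completions

print(generator_loss(fooled), generator_loss(caught))
print(discriminator_loss(d_real, caught), discriminator_loss(d_real, fooled))
```

The generator loss is lower when D is fooled, and the discriminator loss is lower when D separates real from completed samples, which is the tug-of-war described above.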

Results

In order to evaluate the performance under different relative sizes of fragments, we stratified our fragment generation into three main groups: 15-20%, 20-30%, and 30-100% of the initial ceramic model. We can observe that the averaged MSE=0.06 and DSC=0.70 across all classes for the completion of the ceramic is consistent across fragment sizes. This means that our AE-GAN can complete ceramics from smaller to larger initial fragments. Interestingly, we can observe that specific pottery classes are easier to reconstruct across all sizes, such as Classes 10 and 11. We can hypothesize that the improvement over the averaged performance of the method over these classes is because we have more examples of these classes in our dataset. Furthermore, these classes correspond to a homogeneous type of open shapes (e.g., plates).
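Both metrics are easy to compute on voxel grids. Here is a minimal sketch, assuming binary arrays of equal shape for the completed and the ground-truth pottery. MSE is the mean squared voxel difference, and DSC is the Dice similarity coefficient 2|A ∩ B| / (|A| + |B|). The toy shapes are illustrative.

```python
import numpy as np

def voxel_mse(pred, target):
    """Mean squared error between two voxel grids."""
    diff = np.asarray(pred, dtype=float) - np.asarray(target, dtype=float)
    return float(np.mean(diff ** 2))

def dice_coefficient(pred, target):
    """Dice similarity coefficient between two binary voxel grids."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    total = pred.sum() + target.sum()
    if total == 0:
        return 1.0   # both grids empty: treat as a perfect match
    return float(2.0 * np.logical_and(pred, target).sum() / total)

# Ground truth: a 4x4x4 block. Prediction: the same block missing one slice.
target = np.zeros((8, 8, 8), dtype=bool)
target[2:6, 2:6, 2:6] = True
pred = np.zeros((8, 8, 8), dtype=bool)
pred[2:6, 2:6, 2:5] = True

print(voxel_mse(pred, target))         # 0.03125
print(dice_coefficient(pred, target))  # 96 / 112, about 0.857
```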

We also used a pottery classifier. This model yields the same performance on test datasets when evaluating real voxelized potteries and potteries completed by our proposed method from the three fragment-size groups.

Lastly, we ran a case study with domain experts. The archaeologists rated the quality of the reconstructed ceramics with a mean score of 2.09 and a standard deviation of 0.61 (on a scale from 0 to 3). We were also interested in whether the Iberian style holds in the completed samples. The archaeologists rated the Iberian style of the ceramics with a mean score of 3.93 (std 1.16), where 5 corresponds to fully Iberian style.

From this preliminary study, we can conclude that archaeologists judge that the model generated a correct Iberian style from an initial fragment and also consider that the reconstructed pottery is between Good and Very Good. At the end of the questionnaire, we included a comment section to enable unstructured feedback. The comments across evaluators agree on the need for better visualization tools, such as including a scale factor and improving the edge inspection of the models, as these are key factors while evaluating the Iberian Style.

Limitations

Asymmetry in generated potteries: Some completed samples do not have adequate symmetry. The pottery used for this work corresponds to lathe potteries, and those are symmetrical by design. This type of problem is caused by the lack of examples in some classes.

Irregularity in generated potteries: For example, incomplete reconstructions or the mismatching alignment of the fragment with the proposed completion model.

Takeaway and Resources

Results suggest that we can generate potteries that conform to the structure of Iberian ceramics and fulfill experts’ validation criteria. Open source code and the 3D dataset are available! Future work will focus on working with meshes directly and on weighted fracture simulations to account for class imbalance. More details in this presentation.

Look out for our work @ IJCAI 2023

Thursday, 24th August at 11:45-12:45 (AI and Arts: Arts, Design and Crafts Session)

#ARTS2515 IberianVoxel: Automatic Completion of Iberian Ceramics for Cultural Heritage Studies. Pablo Navarro; Celia Cintas; Manuel Lucena; José Manuel Fuertes; Antonio Rueda; Rafael Segura; Carlos Ogayar-Anguita; Rolando González-José; Claudio Delrieux

References

[0] Eslami, D., Di Angelo, L., Di Stefano, P. and Pane, C., 2020. Review of computer-based methods for archaeological ceramic sherds reconstruction. Virtual Archaeology Review, 11(23), pp.34-49.

[1] Lucena, M., Fuertes, J.M., Martínez-Carrillo, A.L., Ruiz, A. and Carrascosa, F., 2017. Classification of archaeological pottery profiles using modal analysis. Multimedia Tools and Applications, 76, pp.21565-21577.

[2] Goodfellow, I., Pouget-Abadie, J., Mirza, M., Xu, B., Warde-Farley, D., Ozair, S., Courville, A. and Bengio, Y., 2014. Generative adversarial nets. Advances in neural information processing systems, 27.

[3] Wu, J., Zhang, C., Xue, T., Freeman, B. and Tenenbaum, J., 2016. Learning a probabilistic latent space of object shapes via 3d generative-adversarial modeling. Advances in neural information processing systems, 29.

The post Completion of ceramics with generative models for Cultural Heritage studies appeared first on Celia Cintas.

Big Data Africa School 2023

Celia Cintas — Mon, 20 Mar 2023 12:34:05 +0000

The Big Data Africa School aims to introduce fundamental data science tools & techniques to talented young science and engineering graduates across various disciplines interested in developing their skills and knowledge in working efficiently on extremely large datasets in any research environment. The 4th Big Data Africa School allowed students to work on real-life datasets in healthcare, focusing on biomedical imaging.

The students worked in teams and developed data and ML pipelines around problems like model interpretability, segmentation, data augmentation, 2D to 3D reconstruction, and out-of-distribution detection across different biomedical datasets (dermatoscopic and blood microscope images, breast mammograms, cardiovascular magnetic resonance, and X-rays).

During BDAS 2023, students learned and developed state-of-the-art data and modeling techniques to address different problems in the medical imaging domain. Each team presented daily updates and preliminary results. Ultimately, each team pitched their work (20′ talks), from defining the problem statement, approach, experimental setup, and evaluation to discussing limitations and lessons learned. Additionally, the school included introductory lectures on Python for Data Science and Machine Learning, community talks, invited technical talks in medical imaging, and communication skill sessions.

The post Big Data Africa School 2023 appeared first on Celia Cintas.

Nairobi Women in Data end of the year Datathon Challenge

Celia Cintas — Wed, 23 Nov 2022 17:34:03 +0000

Nairobi Women in Data, in partnership with IBM Research Africa and Zindi, will be hosting an end-of-year datathon challenge. The datathon will be launched virtually on the 25th of November, with the Datathon ML challenge hosted on Zindi. The physical presentation and awards will be presented at IBM Research Offices on 3rd December.

If you are interested in taking part in the challenge, please follow the instructions on this form to register your team: https://docs.google.com/forms/d/1bVrEAGBjeonIPhNqCN0eV8uFe9IFr2ahvfginYsgs-c/edit?usp=drivesdk

To the Data and ML Champs who would like to be mentors: during the WID’s Datathon, we require a skilled guide for each team. If you feel that this is you, register using the link below:
https://docs.google.com/forms/d/1NEgCXR5y9z5NI5AdmLdv7vY0Xgu9rpgwBTvqYncGOH0/edit?chromeless=1

The post Nairobi Women in Data end of the year Datathon Challenge appeared first on Celia Cintas.

Call for Post-Conference Workshops at ICLR 2023

celiacintas-admin — Sat, 22 Oct 2022 11:56:52 +0000

Happy to announce the CFP for Workshops at ICLR 2023.

Workshops provide an informal, cutting edge venue for discussion of works in progress and future directions. Good workshops have helped to crystallize common problems, explicitly contrast competing frameworks, and clarify essential questions for a subfield or application area. Workshops are a structured means of bringing together people with common interests to form communities. Good workshops should include some form of community building.

Proposals should be submitted through an application using the CMT system.

Important dates for workshop submissions

Workshop Application Open: 19 September 2022
Workshop Application Deadline: 21 October 2022
Workshop Acceptance Notification: 28 November 2022
Suggested Submission Date for Workshop Contributions: 3rd February 2023
Mandatory Accept/Reject Notification Date: 3rd March 2023

The criteria and process by which proposals will be assessed are described in the Guidance for ICLR Workshop Proposals 2023.

The post Call for Post-Conference Workshops at ICLR 2023 appeared first on Celia Cintas.

Call for Presentations

celiacintas-admin — Fri, 22 Jul 2022 11:56:31 +0000

CFP @ Trustworthy AI Workshop @ Deeplearning Indaba 2022

We’re looking for short presentations (10 to 15 minutes) related to:

Audit techniques in data and ML models.
Advances in algorithms and metrics for robust ML.
Uncertainty quantification techniques and Fairness studies.
Applications and research in data and model Privacy/Security.
Methodologies or case studies for explainable and transparent AI.

If you’re interested in presenting your work at the TrustAI Workshop, please submit your response here before the 1st of August 2022.

The post Call for Presentations appeared first on Celia Cintas.

CFP of Practical Machine Learning for Developing Countries workshop at ICLR 2022

celiacintas-admin — Tue, 18 Jan 2022 11:55:21 +0000

Happy to announce the CFP of Practical Machine Learning for Developing Countries workshop at ICLR 2022. We encourage contributions that highlight challenges of learning in low resource environments that are typical in developing countries.

Deadline: February 25th 12:00 AM UTC.

The Practical Machine Learning for Developing Countries (PML4DC) workshop is a full-day event that has been running at ICLR for the past 2 years in a row (past events include PML4DC 2020 and PML4DC 2021). PML4DC aims to foster collaborations and build a cross-domain community by featuring invited talks, panel discussions, contributed presentations (oral and poster), and round-table mixers.

The post CFP of Practical Machine Learning for Developing Countries workshop at ICLR 2022 appeared first on Celia Cintas.

Towards creativity characterization of generative models in the Activation Space

celiacintas-admin — Wed, 03 Mar 2021 13:44:48 +0000

We’re going to be presenting some preliminary results from our work “Towards creativity characterization of generative models via group-based subset scanning” at the Synthetic Data Generation Workshop at ICLR’21.

Creativity is a process that provides novel and meaningful ideas. Current deep learning approaches open a new direction enabling the study of creativity from a knowledge acquisition perspective. Novelty generation using powerful deep generative models, such as Variational Autoencoders (VAEs) and Generative Adversarial Networks (GANs), has been attempted. However, such models discourage out-of-distribution generation to avoid instability and decrease spurious sample generation, limiting their creative generation potential. We propose group-based subset scanning to quantify, detect, and characterize creative processes by detecting a subset of anomalous node-activations in the hidden layers of generative models. Our experiments on original, typically decoded, and “creatively decoded” (Das et al., 2020) image datasets reveal that the proposed subset score distribution is more useful for detecting creative processes in the activation space than in the pixel space. Further, we found that creative samples generate larger subsets of anomalies than normal or non-creative samples across datasets. Also, the node activations highlighted during the creative decoding process are different from those responsible for normal sample generation.

The post Towards creativity characterization of generative models in the Activation Space appeared first on Celia Cintas.
